Using OCHEM to model melting points
Andrew S.I.D. Lang
Using the default settings on OCHEM to model melting points.
OCHEM is a web-based modeling service that allows users to upload data and build models using a variety of descriptors and methods. Some methods are still in beta, but a dedicated team is actively working on the platform. Models do not show up in the list of public models until they have been approved, which happens after they have appeared in a peer-reviewed publication. However, models that you create can be accessed by direct link and used to create web services.
We will use OCHEM to create and publish a model (and webservice) using the same datasets as used in
The first three attempts to create a model using CDK descriptors and R-based random forests failed. This is likely due to the R-based random forest method being 'experimental'. It is our hope that this will be fixed in the future because using R-based random forests is our preferred non-linear modeling technique.
A first successful model
was created using all the default settings (ANN and E-State descriptors), except that we chose bagging as the validation method on the advice of Igor Tetko (personal correspondence), a developer of OCHEM who has used OCHEM to publish his logP and logS model
A second model using CDK descriptors
was created using default settings except for bagging and CORINA for 3D coordinate generation.
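Both successful models used bagging as the validation method. As a rough illustration of the idea only (not OCHEM's implementation), the sketch below trains many simple models on bootstrap resamples and validates each sample using only the models that never saw it (its "out-of-bag" predictions). The function names, the one-descriptor linear model standing in for the ANN, and the toy data are all assumptions made for this example.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; falls back to a constant if x has no spread."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagging_oob_predictions(xs, ys, n_bags=64, seed=0):
    """Return one out-of-bag prediction per sample (None if never out of bag)."""
    rng = random.Random(seed)
    n = len(xs)
    oob = [[] for _ in range(n)]
    for _ in range(n_bags):
        # Bootstrap resample: draw n indices with replacement.
        bag = [rng.randrange(n) for _ in range(n)]
        a, b = fit_line([xs[i] for i in bag], [ys[i] for i in bag])
        in_bag = set(bag)
        # Predict only for samples this model was NOT trained on.
        for i in range(n):
            if i not in in_bag:
                oob[i].append(a + b * xs[i])
    # Average the out-of-bag predictions collected for each sample.
    return [mean(p) if p else None for p in oob]


# Hypothetical toy data: one descriptor value and one melting point per compound.
descriptor = [float(i) for i in range(20)]
melting_point = [2.0 * x + 1.0 for x in descriptor]
oob_predictions = bagging_oob_predictions(descriptor, melting_point)
```

Because every out-of-bag prediction comes from models that did not train on that sample, the error computed from these predictions serves as a validation estimate without setting aside a separate test set.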
Comparing the models created, summarized in the table below, we see that ONSMP008b, created using ANN instead of RF, has a higher AAE than ONSMP007. This matches the findings of O'Boyle et al., whose best model was also RF-based.
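The comparison above rests on AAE. Assuming AAE here denotes the average absolute error between measured and predicted melting points, a minimal sketch of the calculation looks like this. The function name and the numbers are hypothetical and are not taken from any of the ONSMP models.

```python
def average_absolute_error(observed, predicted):
    """Mean of |observed - predicted| over all compounds."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical melting points in degrees Celsius (illustrative values only).
observed = [100.0, 150.0, 200.0]
predicted = [110.0, 145.0, 185.0]
aae = average_absolute_error(observed, predicted)  # (10 + 5 + 15) / 3 = 10.0
```

A lower AAE means the predictions sit closer to the measured values on average, so the model with the lower AAE is the better one by this metric.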
OCHEM is a good tool for creating and distributing models; however, it is still in the development stage and crashes often during model creation. Using ANN with CDK descriptors and bagging seems to be the best technique at this stage.
[Attempting to export model ONSMP008a as an Excel file does not allow exporting more than 100 SMILES (the limit for each month). This makes it effectively useless for sharing chemical information publicly. JCB]
Sushko I, Novotarskyi S, Körner R, Pandey AK, Rupp M, Teetz W, Brandmaier S, Abdelaziz A, Prokopenko VV, Tanchuk VY, Todeschini R, Varnek A, Marcou G, Ertl P, Potemkin V, Grishina M, Gasteiger J, Schwab C, Baskin II, Palyulin VA, Radchenko EV, Welsh WJ, Kholodovych V, Chekmarev D, Cherkasov A, Aires-de-Sousa J, Zhang QY, Bender A, Nigsch F, Patiny L, Williams A, Tkachenko V, Tetko IV. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput Aided Mol Des. 2011; 25(6):533-54
Noel M O'Boyle, David S Palmer, Florian Nigsch and John BO Mitchell. Simultaneous feature selection and parameter optimisation using an artificial ant colony: case study of melting point prediction. Chemistry Central Journal 2008, 2:21 doi:10.1186/1752-153X-2-21