Predictive Solubility

This page links to predictive solubility models. Because data are continuously being added, I expect that we will develop multiple models over time. Follow the links below to view model summaries and details. Models are available via REST services, which should make automated predictions straightforward.

Using the Models

Batch Prediction (instructions on how to locally run batches of compounds through MeltingPointModel002)
ONSModels-R (our self-contained R script for easy calculation of predicted properties)

Aqueous Solubility Prediction

ASM001 (an open aqueous solubility model built on a large dataset of kinetically measured solubility values from PubChem)
ASM002 (ORU student projects: predicting aqueous solubility from SMILES using 2D CDK descriptors)

Melting Point Modeling

Alfa Aesar and Karthikeyan (comparing the two largest data sets of melting point data)
MeltingPointModel001 (predicting melting point from SMILES using the CDK)
MeltingPointModel002 (original RF-based webservice - deprecated)
EPISuite (MPBPWIN™ vs MPModel002)
MeltingPointModel003 (Bergstrom using random forest)
MeltingPointModel004 (predicting melting point from SMILES using 2D CDK descriptors on a highly curated set of melting points)
MeltingPointModel005 (predicting melting point from SMILES using 2D CDK descriptors - without ALogP or ALogp2 - on a highly curated set of melting points)
MeltingPointModel006 (predicting melting point from SDF using 2D+3D CDK descriptors on a highly curated set of melting points)
MeltingPointModel007 (predicting melting point from SMILES using the CDK on a dataset of 20152 unique compounds)
MeltingPointModel008 (predicting melting point from SMILES using OCHEM)
MeltingPointModel009 (predicting melting point from SMILES and deploying on the QsarDB open repository) [top model, but currently too large to deploy]
MeltingPointModel010 (predicting melting point from SMILES on a highly curated set of melting points and deploying on the QsarDB open repository) - webservice
MeltingPointModel011 (modeling melting points using OpenBabel)
MeltingPointModel012 (using the Open Source Prediction Engine with PMML melting point models)

Melting Point Model Testing

MPMTest001 (Models 4, 6, and 7 applied to ortho vs. para and cis vs. trans)

Temperature Dependence

Solubility at any temperature from one (or more) solubility value(s) (includes: converting mole fraction to molarity and the Buchowski equation)
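
The two calculations named in the entry above can be sketched in Python. This is a minimal sketch, not the code behind the linked page. It assumes the Buchowski (λh) equation in its usual form, ln[1 + λ(1 - x)/x] = λh(1/T - 1/Tm), where x is the mole-fraction solubility, T the temperature in kelvin, Tm the solute melting point in kelvin, and λ and h are fitted parameters. The molarity conversion assumes the density of the saturated solution is known. The function names and all numeric inputs are hypothetical.

```python
import math


def buchowski_mole_fraction(temp_k, melting_point_k, lam, h):
    """Mole-fraction solubility x at temp_k from the Buchowski equation.

    Rearranging ln[1 + lam*(1 - x)/x] = lam*h*(1/T - 1/Tm) gives
    x = lam / (exp(lam*h*(1/T - 1/Tm)) - 1 + lam).
    """
    exponent = lam * h * (1.0 / temp_k - 1.0 / melting_point_k)
    return lam / (math.exp(exponent) - 1.0 + lam)


def mole_fraction_to_molarity(x, solute_mw, solvent_mw, solution_density):
    """Convert mole fraction x to molarity (mol/L).

    Molar masses are in g/mol and the solution density is in g/mL.
    One mole of solution weighs x*solute_mw + (1 - x)*solvent_mw grams.
    """
    grams_per_mole_solution = x * solute_mw + (1.0 - x) * solvent_mw
    litres_per_mole_solution = grams_per_mole_solution / solution_density / 1000.0
    return x / litres_per_mole_solution


# Hypothetical inputs for illustration only (not fitted to real data).
x_298 = buchowski_mole_fraction(temp_k=298.15, melting_point_k=395.0, lam=0.5, h=3000.0)
molarity = mole_fraction_to_molarity(
    x_298, solute_mw=122.12, solvent_mw=32.04, solution_density=0.80
)
```

Once λ and h have been fitted from one or more measured values, the same function gives x at any other temperature below Tm. At T = Tm the expression reduces to x = 1, which is a quick sanity check on the rearrangement.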

Abraham General Solvation Model

Abraham Descriptor Explorer (web interface to compounds with known Abraham descriptors - includes substructure search)
Benzoic acid (includes a general description of methods)
Clustering (a set of good solvents for synthesis)
Ugi Product Intermediate of Praziquantel (ab initio calculation of the optimal solvent system for crystallization)
1,2,4-triazole derivatives (predicting the solubility of derivatives by modifying the Abraham descriptors of unsubstituted cores)
boc-serine (calculating the Abraham descriptor for boc-serine by modifying the Abraham descriptors of boc-glycine)
boc-protected amino acids (calculating the Abraham descriptors of boc-protected amino acids by modifying the Abraham descriptors of boc-glycine)
benzoic acid derivatives (comparing the accuracy of measured Abraham descriptors, model001, and the modified Platts fragment method)
cinnamic acid (calculating Abraham descriptors by regressing known solubilities against Abraham's general solvation equations)

Modeling Abraham Solute Descriptors

AbrahamDescriptorsModel001 (CDK-based least-squares model for Abraham descriptors S, A, and B - webservice - superseded by model003)
Other webservices based on Model001:

AbrahamDescriptorsModel002a (CDK-based random forest models for Abraham descriptors)
AbrahamDescriptorsModel002b (CDK-based random forest models for Abraham descriptors - parallel analysis to model002a)
AbrahamDescriptorsModel003 (CDK-based random forest models for Abraham descriptors - webservice)

Abraham Descriptor Based logP

ADlogP001 (an open LFER logP model based upon Abraham descriptors)
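
An Abraham-type LFER expresses logP as a linear combination of the solute descriptors E, S, A, B, and V: logP = c + eE + sS + aA + bB + vV. The sketch below shows the arithmetic only and is not the ADlogP001 model itself. The default coefficients are the values commonly quoted in the literature for the wet octanol/water system and should be treated as an assumption to verify before use. The class names, function name, and descriptor values in the usage lines are hypothetical.

```python
from typing import NamedTuple


class SolventCoefficients(NamedTuple):
    """System coefficients of the Abraham equation."""
    c: float
    e: float
    s: float
    a: float
    b: float
    v: float


class SoluteDescriptors(NamedTuple):
    """Abraham solute descriptors."""
    E: float
    S: float
    A: float
    B: float
    V: float


# Assumed octanol/water coefficients (commonly quoted values; verify before use).
OCTANOL_WATER = SolventCoefficients(
    c=0.088, e=0.562, s=-1.054, a=0.034, b=-3.460, v=3.814
)


def abraham_log_p(solute, coeffs=OCTANOL_WATER):
    """Return logP = c + e*E + s*S + a*A + b*B + v*V."""
    return (
        coeffs.c
        + coeffs.e * solute.E
        + coeffs.s * solute.S
        + coeffs.a * solute.A
        + coeffs.b * solute.B
        + coeffs.v * solute.V
    )


# Hypothetical descriptor values, for illustration only.
example_solute = SoluteDescriptors(E=0.70, S=0.90, A=0.60, B=0.40, V=0.93)
example_log_p = abraham_log_p(example_solute)
```

Keeping the coefficients in their own tuple means the same function can be reused for any other solvent system by passing a different `SolventCoefficients` instance.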

Modeling Abraham Solvent Coefficients

AbrahamSolventsModel001a (CDK-descriptor based models of Abraham solvent coefficients - run1)
AbrahamSolventsModel001b (CDK-descriptor based models of Abraham solvent coefficients - run2)
AbrahamSolventsModel001c (CDK-descriptor based models of Abraham solvent coefficients - run3, outlier identification)
AbrahamSolventsModel002 (CDK-descriptor based models of Abraham solvent coefficients)
AbrahamSolventsModel003 (exploring 'safe' solvents)
AbrahamSolventsModel004 (sustainable solvents)

Predicting Solubility from SMILES

SolubilityModel001 (predicting solubility in methanol) [OBSOLETE]
SolubilityModel002 (predicting solubility in methanol)
SolubilityModel003 (predicting solubility in any solvent - webservice)
SolubilityModel004 (general solute - predicting solubility in methanol - webservice)
SolubilityModel005 (carboxylic acids in methanol - webservice)

General Solubility Model

GeneralSolubilityModel

Real-Time Regression

Ugi product precipitation prediction using logistic regression

Prediction and Hypothesis Formulation using DMax Chemistry Assistant

DMax001-002 (predicting solubility and solvation hypotheses in methanol)