Predictive Solubility

This page links to predictive solubility models. Since data is continuously being added, I expect that we will develop multiple models over time. Follow the links below to view model summaries and details. The models are available via REST services, so automating predictions should be easy.

Using the Models
Batch Prediction (instructions on how to locally run batches of compounds through MeltingPointModel002)
ONSModels-R (our self-contained R script for easy calculation of predicted properties)

Aqueous Solubility Prediction
ASM001 (an open model predicting the aqueous solubility of a large dataset of kinetically measured solubility values from PubChem)
ASM002 (ORU student projects: predicting aqueous solubility from SMILES using 2D CDK descriptors)

Melting Point Modeling
Alfa Aesar and Karthikeyan (comparing the two largest data sets of melting point data)
MeltingPointModel001 (predicting melting point from SMILES using the CDK)
MeltingPointModel002 (original RF-based webservice - deprecated)
EPISuite (MPBPWIN™ vs MPModel002)
MeltingPointModel003 (Bergstrom using random forest)
MeltingPointModel004 (predicting melting point from SMILES using 2D CDK descriptors on a highly curated set of melting points)
MeltingPointModel005 (predicting melting point from SMILES using 2D CDK descriptors - without ALogP or ALogp2 - on a highly curated set of melting points)
MeltingPointModel006 (predicting melting point from SDF using 2D+3D CDK descriptors on a highly curated set of melting points)
MeltingPointModel007 (predicting melting point from SMILES using the CDK on a dataset of 20152 unique compounds)
MeltingPointModel008 (predicting melting point from SMILES using OCHEM)
MeltingPointModel009 (predicting melting point from SMILES and deploying on the QsarDB open repository) [top model, but currently too large to deploy]
MeltingPointModel010 (predicting melting point from SMILES on a highly curated set of melting points and deploying on the QsarDB open repository) - webservice
MeltingPointModel011 (modeling melting points using OpenBabel)
MeltingPointModel012 (using the Open Source Prediction Engine with PMML melting point models)

Melting Point Model Testing
MPMTest001 (Models 4, 6, and 7 applied to ortho vs. para and cis vs. trans)

Temperature Dependence
Solubility at any temperature from one (or more) solubility value(s) (includes: converting mole fraction to molarity and the Buchowski equation)
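As a rough illustration of the two calculations named above, the following Python sketch evaluates the Buchowski (lambda-h) equation, ln[1 + lambda(1 - x)/x] = lambda * h * (1/T - 1/Tm), for the mole-fraction solubility x, and converts a mole fraction to molarity. The function names, the single-point estimate of h for an assumed lambda, and the use of a measured solution density are assumptions made for illustration; they are not taken from the linked page.

```python
import math


def buchowski_mole_fraction(temp_k, melting_point_k, lam, h):
    """Mole-fraction solubility x at temp_k from the Buchowski equation.

    Solves ln(1 + lam * (1 - x) / x) = lam * h * (1/T - 1/Tm) for x.
    """
    e = math.exp(lam * h * (1.0 / temp_k - 1.0 / melting_point_k))
    return lam / (e - 1.0 + lam)


def fit_h_from_one_point(x_known, temp_k, melting_point_k, lam):
    """Back out h from one measured mole fraction, for an assumed lam.

    Assumption: lam is supplied by the caller (e.g. fixed in advance);
    with two or more measurements both parameters could be fitted instead.
    Undefined when temp_k equals melting_point_k.
    """
    lhs = math.log(1.0 + lam * (1.0 - x_known) / x_known)
    return lhs / (lam * (1.0 / temp_k - 1.0 / melting_point_k))


def mole_fraction_to_molarity(x, solute_mw, solvent_mw, solution_density_g_per_ml):
    """Convert solute mole fraction to mol/L for a binary solution.

    Assumption: the density of the saturated solution is known. Using the
    pure-solvent density instead is only reasonable for dilute solutions.
    """
    grams_per_mole_of_mixture = x * solute_mw + (1.0 - x) * solvent_mw
    litres_per_mole_of_mixture = grams_per_mole_of_mixture / (
        solution_density_g_per_ml * 1000.0
    )
    return x / litres_per_mole_of_mixture
```

With one measured value, `fit_h_from_one_point` gives an h that can then be passed to `buchowski_mole_fraction` at any other temperature below the melting point.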

Abraham General Solvation Model
Abraham Descriptor Explorer (web interface to compounds with known Abraham descriptors - includes substructure search)
Benzoic acid (includes a general description of methods)
Clustering (a set of good solvents for synthesis)
Ugi Product Intermediate of Praziquantel (ab initio calculation of the optimal solvent system for crystallization)
1,2,4-triazole derivatives (predicting the solubility of derivatives by modifying the Abraham descriptors of unsubstituted cores)
boc-serine (calculating the Abraham descriptors for boc-serine by modifying the Abraham descriptors of boc-glycine)
boc-protected amino acids (calculating the Abraham descriptors of boc-protected amino acids by modifying the Abraham descriptors of boc-glycine)
benzoic acid derivatives (comparing the accuracy of measured Abraham descriptors, model001, and the modified Platts fragment method)
cinnamic acid (calculating Abraham descriptors by regressing known solubilities against Abraham's general solvation equations)
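For orientation, the Abraham general solvation equation used throughout these pages has the linear form log10(S_solvent / S_water) = c + eE + sS + aA + bB + vV, where E, S, A, B and V are solute descriptors and c, e, s, a, b and v are solvent coefficients. The Python sketch below only evaluates that form; the class and function names are hypothetical, and all descriptor and coefficient values must be supplied by the caller from a measured or modeled source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoluteDescriptors:
    """Abraham solute descriptors (hypothetical container for this sketch)."""
    E: float
    S: float
    A: float
    B: float
    V: float


@dataclass(frozen=True)
class SolventCoefficients:
    """Abraham solvent coefficients (hypothetical container for this sketch)."""
    c: float
    e: float
    s: float
    a: float
    b: float
    v: float


def log_solubility_ratio(solute, solvent):
    """Return log10(S_solvent / S_water) from the linear solvation equation."""
    return (
        solvent.c
        + solvent.e * solute.E
        + solvent.s * solute.S
        + solvent.a * solute.A
        + solvent.b * solute.B
        + solvent.v * solute.V
    )


def solubility_in_solvent(water_solubility_molar, solute, solvent):
    """Scale a known aqueous solubility (mol/L) to the organic solvent."""
    return water_solubility_molar * 10.0 ** log_solubility_ratio(solute, solvent)
```

Regressing measured solubilities in several solvents against this same form is what allows unknown solute descriptors to be estimated, as in the cinnamic acid entry.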

Modeling Abraham Solute Descriptors
AbrahamDescriptorsModel001 (CDK-based least squares model for Abraham descriptors S, A, and B - webservice - this model has been superseded by model003)
Other webservices based on Model001:
 * Solubility from Abraham descriptors web service
 * Optimal solvent selection for reactions (now uses model003)
 * Solubility prediction via simple URL

AbrahamDescriptorsModel002a (CDK-based random forest models for Abraham descriptors)
AbrahamDescriptorsModel002b (CDK-based random forest models for Abraham descriptors - parallel analysis to model002a)
AbrahamDescriptorsModel003 (CDK-based random forest models for Abraham descriptors - webservice)
 * Domain of Applicability ADM003

Abraham Descriptor Based logP
ADlogP001 (an open LFER logP model based upon Abraham descriptors)

Modeling Abraham Solvent Coefficients
AbrahamSolventsModel001a CDK-descriptor based models of Abraham solvent coefficients - run1
AbrahamSolventsModel001b CDK-descriptor based models of Abraham solvent coefficients - run2
AbrahamSolventsModel001c CDK-descriptor based models of Abraham solvent coefficients - run3 (outlier identification)
AbrahamSolventsModel002 CDK-descriptor based models of Abraham solvent coefficients
AbrahamSolventsModel003 Exploring 'safe' solvents
AbrahamSolventsModel004 Sustainable Solvents

Predicting Solubility from SMILES
SolubilityModel001 (predicting solubility in methanol) [OBSOLETE]
SolubilityModel002 (predicting solubility in methanol)
SolubilityModel003 (predicting solubility in any solvent - webservice)
SolubilityModel004 (general solute - predicting solubility in methanol - webservice)
SolubilityModel005 (carboxylic acids in methanol - webservice)

General Solubility Model
GeneralSolubilityModel

Real-Time Regression
Ugi product precipitation prediction using logistic regression:
 * Methanol (default).
 * THF (solvent controlled via URL).
 * Probability of precipitation (SMILES passed via URL).
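To show what a logistic regression prediction of this kind computes, here is a minimal Python sketch that turns a set of descriptor values into a probability of precipitation. The function name, the feature vector, and the coefficients are hypothetical placeholders; the descriptors and fitted coefficients used by the linked services are not reproduced here.

```python
import math


def precipitation_probability(features, coefficients, intercept):
    """Logistic model: p = 1 / (1 + exp(-(intercept + sum(coef_i * x_i)))).

    features and coefficients are equal-length sequences of floats; which
    descriptors they represent is left to the caller (hypothetical here).
    """
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    # Evaluate the sigmoid in a form that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)
```

A returned value above a chosen threshold (0.5 is a common default) would be read as "precipitate expected".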

Prediction and Hypothesis Formulation using DMax Chemistry Assistant
DMax001-002 (predicting solubility and solvation hypotheses in methanol)