ONS Models in R


ONSModels-R Version 1.0.
An easy-to-use zip file and simple instructions for using our models to calculate predicted properties for organic compounds.
Currently, this version calculates Abraham solvent coefficients (assuming c=0) using Model002.
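
For context, the coefficients e, s, a, b and v are the solvent-side terms of the Abraham equation, log P = c + eE + sS + aA + bB + vV, where E, S, A, B and V are solute descriptors. The sketch below shows how a set of coefficients could be combined with solute descriptors when c is fixed at 0. The function name abraham_log_p is hypothetical and the numbers are placeholders chosen for easy arithmetic; they are not output of Model002.

```r
# Sketch: evaluate the Abraham equation with the intercept c fixed at 0.
# abraham_log_p is a hypothetical helper, not part of ONSModels-R.
abraham_log_p <- function(solvent, solute, intercept = 0) {
  # solvent: named numeric vector of solvent coefficients e, s, a, b, v
  # solute:  named numeric vector of solute descriptors E, S, A, B, V
  intercept +
    solvent[["e"]] * solute[["E"]] +
    solvent[["s"]] * solute[["S"]] +
    solvent[["a"]] * solute[["A"]] +
    solvent[["b"]] * solute[["B"]] +
    solvent[["v"]] * solute[["V"]]
}

# Placeholder values only, not real coefficients or descriptors
solvent <- c(e = 1, s = 1, a = 1, b = 1, v = 1)
solute  <- c(E = 1, S = 2, A = 3, B = 4, V = 5)
abraham_log_p(solvent, solute)
```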

Instructions

Installation Instructions
1. Download and install R, the open source statistics program.
After installing R, you will need to install two packages: rcdk and randomForest.
You can install the packages from within R through the menu (I usually choose a USA mirror). If you prefer, you can install the packages from local zip files or, in the case of rcdk, from GitHub.
20140729R1.png
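
As an alternative to the menu, the two packages can be installed from the R console. The sketch below installs only what is missing. The helper name missing_packages and the mirror URL are assumptions for illustration; any CRAN mirror should do. The install step is guarded by interactive() so that it only runs in an interactive session.

```r
# Sketch: find which required packages are not yet installed.
# missing_packages is a hypothetical helper name.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

to_install <- missing_packages(c("rcdk", "randomForest"))
if (interactive() && length(to_install) > 0) {
  # The mirror URL is an assumption; pick whichever mirror you prefer
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```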

2. Download ONSModels-R.zip and unzip it in the C:\ directory.
That is, once unzipped, all your model and script files should be in the C:\ONSModels-R directory. If you would like to put the directory somewhere else, you will need to update the setwd command inside the ONSModels-R.R file.
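
If you do move the directory, a quick check like the one below can confirm that the new location holds the five model files before you point setwd at it. This is a minimal sketch: missing_model_files is a hypothetical helper, and the file names are the ones the script loads with readRDS.

```r
# Sketch: list which of the expected model files are absent from a directory.
# missing_model_files is a hypothetical helper, not part of ONSModels-R.
missing_model_files <- function(model_dir) {
  expected <- c("erfmodel", "srfmodel", "arfmodel", "brfmodel", "vrfmodel")
  expected[!file.exists(file.path(model_dir, expected))]
}

model_dir <- "C:/ONSModels-R"  # change this if you unzipped elsewhere
absent <- missing_model_files(model_dir)
if (length(absent) == 0) {
  setwd(model_dir)
} else {
  message("Missing from ", model_dir, ": ", paste(absent, collapse = ", "))
}
```

Note that R accepts forward slashes in Windows paths, which avoids having to escape backslashes.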

Once you have completed step 2, the script is ready to run.

Usage Instructions
1. Open R and source the script ONSModels-R.R
20140729R2.png
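
The script can also be sourced from the R console instead of the menu. The sketch below assumes the default install location from step 2 of the installation instructions; adjust script_path if you unzipped elsewhere.

```r
# Sketch: source the script from the console.
# script_path assumes the default C:/ONSModels-R location.
script_path <- "C:/ONSModels-R/ONSModels-R.R"
if (file.exists(script_path)) {
  source(script_path)
} else {
  message("Script not found at ", script_path)
}
```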

2. Enter a SMILES or a list of SMILES separated by commas (no spaces). Here I calculate the Abraham solvent coefficients for 4-methylpent-3-en-2-one.
20140729R3.png
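
The script splits the input line on commas with strsplit and does not trim whitespace, which is why spaces should be avoided. A slightly more forgiving split can be sketched in base R as follows. The helper name split_smiles is hypothetical; the first SMILES is one way of writing 4-methylpent-3-en-2-one and the second (ethanol) is just an extra example.

```r
# Sketch: split a comma-separated line of SMILES, tolerating stray spaces.
# split_smiles is a hypothetical helper, not part of ONSModels-R.
split_smiles <- function(line) {
  parts <- trimws(unlist(strsplit(line, ",", fixed = TRUE)))
  parts[nzchar(parts)]  # drop empty entries such as a trailing comma
}

split_smiles("CC(=O)C=C(C)C, CCO")
```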

Code

The following code is released under a CC0 license.
## change to the new directory
setwd("C:/ONSModels-R")
## load the necessary libraries
library(rcdk)
library(randomForest)
## Get user to input SMILES
SMILESFUN <- function() {
    message("Enter SMILES separated by commas")
    x <- readLines(n = 1)
    return(x)
}
 
## get SMILES
smiles <- unlist(strsplit(SMILESFUN(),","))
mols <- parse.smiles(smiles)
 
## load the model for the 'e' coefficient
e.rf <- readRDS("erfmodel")
## load the model for the 's' coefficient
s.rf <- readRDS("srfmodel")
## load the model for the 'a' coefficient
a.rf <- readRDS("arfmodel")
## load the model for the 'b' coefficient
b.rf <- readRDS("brfmodel")
## load the model for the 'v' coefficient
v.rf <- readRDS("vrfmodel")
 
## get descriptors
descNames <- unique(unlist(sapply(get.desc.categories(), get.desc.names)))
allDescs <- eval.desc(mols,descNames)
 
## make predictions
e.predict <- signif(predict(e.rf,allDescs),digits=3)
s.predict <- signif(predict(s.rf,allDescs),digits=3)
a.predict <- signif(predict(a.rf,allDescs),digits=3)
b.predict <- signif(predict(b.rf,allDescs),digits=3)
v.predict <- signif(predict(v.rf,allDescs),digits=3)
 
 
## format the predictions for outputting
results <- data.frame(e.predict,s.predict,a.predict,b.predict,v.predict)
colnames(results) <- c("e","s","a","b","v")
 
## print results to console
print(results)
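
When several SMILES are entered at once, it can help to keep each input next to its predictions and save the table. The sketch below is one way to do that in base R. label_results is a hypothetical helper, the output file name is an assumption, and the table here is filled with NA placeholders rather than real predictions; after running the script you would pass its own results and smiles objects instead.

```r
# Sketch: attach the input SMILES to the results table and save it as CSV.
# label_results is a hypothetical helper, not part of ONSModels-R.
label_results <- function(results, smiles) {
  stopifnot(nrow(results) == length(smiles))
  data.frame(smiles = smiles, results,
             row.names = NULL, stringsAsFactors = FALSE)
}

# Stand-in inputs: NA placeholders, not real predictions
smiles <- c("CC(=O)C=C(C)C", "CCO")
results <- data.frame(e = c(NA_real_, NA_real_), s = c(NA_real_, NA_real_),
                      a = c(NA_real_, NA_real_), b = c(NA_real_, NA_real_),
                      v = c(NA_real_, NA_real_))

labelled <- label_results(results, smiles)
out_file <- file.path(tempdir(), "ons_predictions.csv")  # assumed file name
write.csv(labelled, out_file, row.names = FALSE)
```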