AbrahamSolventsModel003a

Recalculating Solvent Coefficient
All content, models and data are released as CC0
Researchers: Jean-Claude Bradley, Michael H Abraham, William E Acree, Jr., and Andrew SID Lang

Objective
To recalculate the Abraham model solvent coefficients under the constraint that the c-coefficient equals zero, and then to create and publish models predicting those coefficients. See model003 for background.

Procedure
The Abraham general solvation model uses the linear free energy relationship (LFER)

log P = c + e E + s S + a A + b B + v V

where c, e, s, a, b, v are the solvent coefficients and E, S, A, B, V are the solute descriptors. The Abraham coefficients are found via linear regression from measured data. The standard procedure is to allow the c-coefficient (the intercept) to float in the linear regression. We suggest that little predictive ability will be lost if c is instead required to be zero, and that doing so allows easier comparison between solvents. Thus, in order to compare both current solvents with each other and potential new solvents with current solvents, we decided to recalculate the coefficients for known solvents, e_0, s_0, a_0, b_0, v_0, with c fixed at zero. This was achieved by calculating the log P values in over 90 solvents for ???? compounds with known Abraham descriptors from our ((figshare database?????)) and then re-running the linear regression using R. The following code with results is typical:

code
setwd(".../MakingCZero")
mydata = read.csv(file="makingczeroreadyforR.csv",header=TRUE,row.names="csid")
# "~ 0 +" removes the intercept, i.e. forces c = 0
fit <- lm(isopropyl.myristate ~ 0 + E + S + A + B + V,data=mydata)
summary(fit)
# 1) summary of fit

[output]

UPDATE BELOW.........................

Call:
lm(formula = isopropyl.myristate ~ 0 + E + S + A + B + V, data = mydata)

Residuals:
     Min       1Q   Median       3Q      Max
-0.55191 -0.25598 -0.13732  0.00069  1.78549

Coefficients:
   Estimate Std. Error t value Pr(>|t|)
E  0.977259   0.011781   82.95   <2e-16 ***
S -1.294959   0.014814  -87.41   <2e-16 ***
A -1.870114   0.020493  -91.26   <2e-16 ***
B -4.017729   0.015120 -265.73   <2e-16 ***
V  3.939081   0.007844  502.19   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2503 on 2139 degrees of freedom
Multiple R-squared: 0.9958,    Adjusted R-squared:  0.9958
F-statistic: 1.009e+05 on 5 and 2139 DF, p-value: < 2.2e-16
[output]
code

The following table lists the original solvent coefficients together with the c=0 adjusted coefficients. Not surprisingly, the largest changes in coefficient values occur for solvents with c-values furthest from zero. What is a little intriguing is that all the coefficients move consistently the same way.

UPDATE BELOW...............

That is, solvents with negative c-values all saw an increase in e and b (and a decrease in s, a, and v) when the recalculation was performed, whereas solvents with positive c-values all saw an increase in s, a, and v (and a decrease in e and b). Multiplying the average absolute deviation of a coefficient by the average value of the corresponding descriptor gives a measure of the degree to which that coefficient was changed. The adjusted coefficients changed (as measured by e.g. AAE(v_0) * Mean(V)) in the order v (0.124), s (0.043), e (0.013), b (0.011), a (0.010).
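The comparison described above can be sketched in R on synthetic data. This is only an illustration under stated assumptions, not the calculation behind the numbers reported here: the solute descriptors, the three solvents (solvent_neg, solvent_zero, solvent_pos) and their coefficients are all made up, and every solvent is assumed to be fitted on the same set of solutes. Under that assumption, ordinary least squares gives the identity b_0 = b + c_hat * g, with g = (X'X)^-1 X'1, so the shift in each coefficient is the fitted intercept times a direction g that depends only on the solutes.

```r
# Synthetic illustration only: every number below is made up,
# not taken from the real data set.
set.seed(42)
n <- 200
descr <- c("E", "S", "A", "B", "V")

# Hypothetical solute descriptors (uniform draws over assumed ranges).
solutes <- data.frame(
  E = runif(n, 0, 2),
  S = runif(n, 0, 2),
  A = runif(n, 0, 1),
  B = runif(n, 0, 1.5),
  V = runif(n, 0.5, 2.5)
)
X <- as.matrix(solutes[, descr])

# Hypothetical solvents with negative, zero and positive c.
solvents <- data.frame(
  name = c("solvent_neg", "solvent_zero", "solvent_pos"),
  c = c(-0.30, 0.00, 0.30),
  e = c(0.90, 0.50, 0.20),
  s = c(-1.20, -0.80, -0.50),
  a = c(-1.80, -3.00, -0.10),
  b = c(-4.00, -4.50, -3.50),
  v = c(4.00, 4.30, 3.80)
)

# For each solvent: simulate log P, fit with and without an intercept,
# and record how far each coefficient moves when c is forced to zero.
c_hat <- numeric(nrow(solvents))
shifts <- matrix(NA_real_, nrow(solvents), length(descr),
                 dimnames = list(as.character(solvents$name), descr))
for (i in seq_len(nrow(solvents))) {
  beta <- unlist(solvents[i, c("e", "s", "a", "b", "v")])
  d <- solutes
  d$logP <- as.numeric(solvents$c[i] + X %*% beta) + rnorm(n, sd = 0.1)
  fit_float <- lm(logP ~ E + S + A + B + V, data = d)      # c floats
  fit_zero  <- lm(logP ~ 0 + E + S + A + B + V, data = d)  # c = 0
  c_hat[i] <- coef(fit_float)[["(Intercept)"]]
  shifts[i, ] <- coef(fit_zero)[descr] - coef(fit_float)[descr]
}

# Direction shared by all solvents fitted on the same solutes:
# g = (X'X)^-1 X'1, so that shift = c_hat * g.
g <- as.numeric(solve(crossprod(X), colSums(X)))
names(g) <- descr

# Size of the change per descriptor, in the spirit of AAE(v_0) * Mean(V):
# average absolute shift across solvents times the mean descriptor value.
measure <- colMeans(abs(shifts)) * colMeans(X)

print(round(shifts, 3))
print(round(g, 3))
print(round(sort(measure, decreasing = TRUE), 3))
```

Because g is the same for every solvent in this sketch, the sign of each coefficient's shift is set by the sign of the fitted c alone. If the real fits share one solute set, this may account for the consistent direction of movement noted above, although that has not been checked against the actual data.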