Solvent Coefficients For Alternative 'Safe' Solvents

Researchers: Jean-Claude Bradley and Andrew SID Lang
All content, models and data are released as CC0 - the default license for all our ONS work.

Objective

To investigate the predicted solvation properties of solvents deemed safe by the EPA. These solvents will be compared with solvents that have known Abraham solvent coefficients, with two aims: to identify safer replacements for existing solvents, and to find potential new safe solvents that occupy a part of chemical space not currently covered by solvents with known Abraham coefficients.

Procedure

The Abraham general solvation model uses the LFER

log P = c + e E + s S + a A + b B + v V

where c, e, s, a, b, v are the solvent coefficients and E, S, A, B, V are the solute descriptors; see this brief discussion of the model. The Abraham coefficients are found by linear regression on measured data. The standard procedure is to let the c-coefficient (the intercept) float in the regression. It has been suggested that c should not be negative [1]. We suggest that little predictive ability is lost if c is simply required to be zero, which also makes solvents easier to compare. Thus, in order to compare current solvents with each other and potential new solvents with current solvents, we re-calculated the coefficients for known solvents (e_0, s_0, a_0, b_0, v_0) with c set to zero. This was done by calculating the log P values in 90 solvents for 2144 compounds (2144compounds.csv) with known Abraham descriptors from our Open Abraham Descriptors Database and then re-running the linear regression, without an intercept, in R. The following code with results is typical:
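setwd(".../MakingCZero")
## rows are compounds (named by csid); columns hold the descriptors and one log P column per solvent
mydata <- read.csv(file="makingczeroreadyforR.csv", header=TRUE, row.names="csid")
## the "0 +" term removes the intercept, i.e. forces c = 0
fit <- lm(isopropyl.myristate ~ 0 + E + S + A + B + V, data=mydata)
## summary of fit
summary(fit)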
 
[output]
Call:
lm(formula = isopropyl.myristate ~ 0 + E + S + A + B + V, data = mydata)
 
Residuals:
     Min       1Q   Median       3Q      Max
-0.55191 -0.25598 -0.13732  0.00069  1.78549
 
Coefficients:
   Estimate Std. Error t value Pr(>|t|)
E  0.977259   0.011781   82.95   <2e-16 ***
S -1.294959   0.014814  -87.41   <2e-16 ***
A -1.870114   0.020493  -91.26   <2e-16 ***
B -4.017729   0.015120 -265.73   <2e-16 ***
V  3.939081   0.007844  502.19   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
 
Residual standard error: 0.2503 on 2139 degrees of freedom
Multiple R-squared:  0.9958,    Adjusted R-squared:  0.9958
F-statistic: 1.009e+05 on 5 and 2139 DF,  p-value: < 2.2e-16
[output]
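The same zero-intercept fit can be repeated for every solvent column in one pass. The sketch below is only illustrative: it uses a small synthetic data frame in place of makingczeroreadyforR.csv, and the column names solvent.one and solvent.two, the helper fit_zero_intercept and all numbers are assumptions, not part of the original work.

```r
# Synthetic stand-in for the real data file; every number here is made up.
set.seed(1)
n <- 200
mydata <- data.frame(E = runif(n), S = runif(n), A = runif(n),
                     B = runif(n), V = runif(n, 0.5, 3))
# Two hypothetical solvent columns built from known coefficients plus noise.
mydata$solvent.one <- with(mydata, 0.5*E - 1.0*S - 3.0*A - 4.5*B + 4.4*V) +
  rnorm(n, sd = 0.05)
mydata$solvent.two <- with(mydata, 0.3*E - 0.6*S + 0.2*A - 3.8*B + 4.0*V) +
  rnorm(n, sd = 0.05)

fit_zero_intercept <- function(solvent, data) {
  # intercept = FALSE drops the intercept, like "0 +" in a hand-written formula
  f <- reformulate(c("E", "S", "A", "B", "V"), response = solvent,
                   intercept = FALSE)
  coef(lm(f, data = data))
}

solvents <- c("solvent.one", "solvent.two")
# One row per solvent, one column per c = 0 coefficient.
coefs <- t(sapply(solvents, fit_zero_intercept, data = mydata))
colnames(coefs) <- c("e_0", "s_0", "a_0", "b_0", "v_0")
print(round(coefs, 3))
```

With the real file, solvents would presumably be the 90 solvent column names, giving one row of adjusted coefficients per solvent.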
The following table lists the original solvent coefficients together with the c = 0 adjusted coefficients. Not surprisingly, the largest changes in coefficient values occur for solvents with c-values furthest from zero. What is a little intriguing is that the coefficients all move consistently in the same direction. That is, solvents with negative c-values all saw an increase in e and b (and a decrease in s, a, and v) on recalculation, whereas solvents with positive c-values all saw an increase in s, a, and v (and a decrease in e and b). Multiplying the average absolute deviation of a coefficient by the average value of the corresponding descriptor gives a measure of how much that coefficient changed. By this measure (e.g. AAE(v_0) * Mean(V)), the adjusted coefficients changed in the order v (0.124), s (0.043), e (0.013), b (0.011), a (0.010).
c | e | s | a | b | v | solvent | e_0 | s_0 | a_0 | b_0 | v_0
0.17 | 0.4 | -1.01 | 0.06 | -3.96 | 4.04 | 1-butanol | 0.387596 | -0.97209 | 0.108258 | -3.97885 | 4.12895
0.22 | 0.27 | -0.57 | -2.92 | -4.88 | 4.46 | 1-chlorobutane | 0.254833 | -0.516892 | -2.847674 | -4.910816 | 4.57048
-0.06 | 0.62 | -1.32 | 0.03 | -4.15 | 4.28 | 1-decanol | 0.6203042 | -1.3327395 | 0.0090799 | -4.1464701 | 4.2497972
0.04 | 0.4 | -1.06 | 0 | -4.34 | 4.32 | 1-heptanol | 0.3948325 | -1.0545913 | 0.0143256 | -4.3468223 | 4.3352649
0.12 | 0.71 | -1.62 | -3.18 | -4.8 | 4.32 | 1-hexadecene | 0.696611 | -1.588924 | -3.143734 | -4.810655 | 4.38202
0.12 | 0.49 | -1.16 | 0.05 | -3.98 | 4.13 | 1-hexanol | 0.482763 | -1.137261 | 0.090893 | -3.993082 | 4.190662
-0.03 | 0.49 | -1.04 | -0.02 | -4.24 | 4.22 | 1-octanol | 0.4909845 | -1.0517092 | -0.0336817 | -4.2309735 | 4.2009611
0.15 | 0.54 | -1.23 | 0.14 | -3.86 | 4.08 | 1-pentanol | 0.52385 | -1.193867 | 0.188379 | -3.88273 | 4.154244
0.14 | 0.41 | -1.03 | 0.25 | -3.77 | 3.99 | 1-propanol | 0.393466 | -0.996062 | 0.291203 | -3.784515 | 4.05757
0.18 | 0.29 | -0.13 | -2.8 | -4.29 | 4.18 | 1,2-dichloroethane | 0.278894 | -0.090721 | -2.742532 | -4.313978 | 4.274203
0.12 | 0.35 | -0.03 | -0.58 | -4.81 | 4.11 | 1,4-dioxane | 0.33675 | -0.003994 | -0.542209 | -4.825876 | 4.173499
0.1 | 0.62 | -1.8 | -3.07 | -4.29 | 4.52 | 1,9-decadiene | 0.606347 | -1.771424 | -3.037034 | -4.304097 | 4.571908
0.13 | 0.25 | -0.98 | 0.16 | -3.88 | 4.11 | 2-butanol | 0.242337 | -0.945988 | 0.198721 | -3.897972 | 4.179391
0.19 | 0.35 | -1.13 | 0.02 | -3.57 | 3.97 | 2-methyl-1-propanol | 0.338508 | -1.082867 | 0.07594 | -3.591543 | 4.064762
0.21 | 0.17 | -0.95 | 0.33 | -4.09 | 4.11 | 2-methyl-2-propanol | 0.153709 | -0.897144 | 0.397899 | -4.111567 | 4.217508
0.12 | 0.46 | -1.33 | 0.21 | -3.75 | 4.2 | 2-pentanol | 0.445389 | -1.303994 | 0.243289 | -3.759434 | 4.260301
0.1 | 0.34 | -1.05 | 0.41 | -3.83 | 4.03 | 2-propanol | 0.334933 | -1.025916 | 0.437909 | -3.839249 | 4.084005
0.32 | 0.51 | -1.69 | -3.69 | -4.81 | 4.4 | 2,2,4-trimethylpentane | 0.485433 | -1.610035 | -3.586041 | -4.850718 | 4.563507
0.07 | 0.36 | -1.27 | 0.09 | -3.77 | 4.4 | 3-methyl-1-butanol | 0.35352 | -1.255543 | 0.11342 | -3.779382 | 4.43679
0.31 | 0.31 | -0.12 | -0.61 | -4.75 | 3.94 | acetone | 0.286706 | -0.04747 | -0.508846 | -4.792269 | 4.102844
0.41 | 0.08 | 0.33 | -1.57 | -4.39 | 3.36 | acetonitrile | 0.044129 | 0.423135 | -1.4362 | -4.443325 | 3.576285
0.14 | 0.46 | -0.59 | -3.01 | -4.63 | 4.49 | benzene | 0.452175 | -0.554143 | -2.963555 | -4.643338 | 4.564318
0.1 | 0.29 | 0.06 | -1.61 | -4.56 | 4.03 | benzonitrile | 0.277045 | 0.081936 | -1.574291 | -4.574904 | 4.07839
-0.02 | 0.44 | -0.42 | -3.17 | -4.56 | 4.45 | bromobenzene | 0.4369346 | -0.4279276 | -3.1781219 | -4.5563083 | 4.4367652
0.25 | 0.26 | -0.08 | -0.77 | -4.86 | 4.15 | butanone | 0.236179 | -0.022077 | -0.68909 | -4.886268 | 4.274532
0.25 | 0.36 | -0.5 | -0.87 | -4.97 | 4.28 | butyl acetate | 0.336024 | -0.442788 | -0.788152 | -5.004535 | 4.408761
0.05 | 0.69 | -0.94 | -3.6 | -5.82 | 4.92 | carbon disulfide | 0.6819348 | -0.9318396 | -3.5870443 | -5.8248542 | 4.9458403
0.2 | 0.52 | -1.16 | -3.56 | -4.59 | 4.62 | carbon tetrachloride | 0.506806 | -1.112282 | -3.496545 | -4.619008 | 4.720644
0.07 | 0.38 | -0.52 | -3.18 | -4.7 | 4.61 | chlorobenzene | 0.3752336 | -0.5056045 | -3.1613969 | -4.7083071 | 4.6478423
0.19 | 0.11 | -0.4 | -3.11 | -3.51 | 4.4 | chloroform | 0.089413 | -0.357874 | -3.051291 | -3.537934 | 4.493193
0.16 | 0.78 | -1.68 | -3.74 | -4.93 | 4.58 | cyclohexane | 0.770769 | -1.640414 | -3.689346 | -4.948869 | 4.659072
0.04 | 0.23 | 0.06 | -0.98 | -4.84 | 4.32 | cyclohexanone | 0.2216766 | 0.0668337 | -0.962833 | -4.8469389 | 4.3348761
0.19 | 0.72 | -1.74 | -3.45 | -4.97 | 4.48 | decane | 0.706653 | -1.697274 | -3.389851 | -4.993254 | 4.571974
0.18 | 0.39 | -0.99 | -1.41 | -5.36 | 4.52 | dibutyl ether | 0.37973 | -0.94367 | -1.357836 | -5.379393 | 4.614845
0.33 | 0.3 | -0.44 | 0.36 | -4.9 | 3.95 | dibutylformamide | 0.275223 | -0.3577 | 0.462459 | -4.944045 | 4.122737
0.32 | 0.1 | -0.19 | -3.06 | -4.09 | 4.32 | dichloromethane | 0.076325 | -0.111839 | -2.957248 | -4.130085 | 4.487963
0.35 | 0.36 | -0.82 | -0.59 | -4.96 | 4.35 | diethyl ether | 0.329941 | -0.737491 | -0.477868 | -5.000297 | 4.530001
0.21 | 0.03 | 0.09 | 1.34 | -5.08 | 4.09 | diethylacetamide | 0.0167 | 0.139253 | 1.409338 | -5.111165 | 4.197615
-0.27 | 0.08 | 0.21 | 0.92 | -5 | 4.56 | dimethylacetamide | 0.104844 | 0.145499 | 0.831639 | -4.970213 | 4.418588
-0.31 | -0.06 | 0.34 | 0.36 | -4.87 | 4.49 | DMF | -0.034208 | 0.27126 | 0.264139 | -4.827615 | 4.330082
-0.19 | 0.33 | 0.79 | 1.26 | -4.54 | 3.36 | DMSO | 0.341749 | 0.745718 | 1.200253 | -4.516908 | 3.262081
0.11 | 0.67 | -1.64 | -3.55 | -5.01 | 4.46 | dodecane | 0.658583 | -1.616828 | -3.508611 | -5.020509 | 4.517868
0.22 | 0.47 | -1.04 | 0.33 | -3.6 | 3.86 | ethanol | 0.453079 | -0.983328 | 0.396198 | -3.623493 | 3.971272
-0.17 | -0.02 | 0 | 0.07 | -0.37 | 0.45 | ethanol/water(10:90)vol | -0.009316 | -0.04156 | 0.010626 | -0.350331 | 0.365271
-0.25 | 0.04 | -0.04 | 0.1 | -0.83 | 0.92 | ethanol/water(20:80)vol | 0.062777 | -0.098993 | 0.01722 | -0.800521 | 0.786631
-0.27 | 0.11 | -0.1 | 0.13 | -1.32 | 1.41 | ethanol/water(30:70)vol | 0.127804 | -0.160996 | 0.049426 | -1.282852 | 1.276293
-0.22 | 0.13 | -0.16 | 0.17 | -1.81 | 1.92 | ethanol/water(40:60)vol | 0.148332 | -0.210817 | 0.102648 | -1.781936 | 1.804907
-0.14 | 0.12 | -0.25 | 0.25 | -2.28 | 2.42 | ethanol/water(50:50)vol | 0.134901 | -0.285203 | 0.207128 | -2.257463 | 2.342294
-0.04 | 0.14 | -0.34 | 0.29 | -2.68 | 2.81 | ethanol/water(60:40)vol | 0.1406465 | -0.3442878 | 0.281256 | -2.6697239 | 2.7916452
0.06 | 0.09 | -0.37 | 0.31 | -2.94 | 3.1 | ethanol/water(70:30)vol | 0.0794107 | -0.3529953 | 0.331463 | -2.9438845 | 3.1344568
0.17 | 0.18 | -0.47 | 0.26 | -3.21 | 3.32 | ethanol/water(80:20)vol | 0.161026 | -0.424495 | 0.314211 | -3.233463 | 3.411426
0.24 | 0.21 | -0.58 | 0.26 | -3.45 | 3.55 | ethanol/water(90:10)vol | 0.193477 | -0.51767 | 0.338521 | -3.480739 | 3.669878
0.33 | 0.37 | -0.45 | -0.7 | -4.9 | 4.15 | ethyl acetate | 0.342809 | -0.369036 | -0.596948 | -4.94523 | 4.318697
0.09 | 0.47 | -0.72 | -3 | -4.84 | 4.51 | ethylbenzene | 0.459437 | -0.701228 | -2.970828 | -4.855741 | 4.56218
-0.27 | 0.58 | -0.51 | 0.72 | -2.62 | 2.73 | ethylene glycol | 0.599449 | -0.574819 | 0.631321 | -2.585314 | 2.5908
0.14 | 0.15 | -0.37 | -3.03 | -4.6 | 4.54 | fluorobenzene | 0.140337 | -0.340978 | -2.985464 | -4.618238 | 4.611483
-0.17 | 0.07 | 0.31 | 0.59 | -3.15 | 2.43 | formamide | 0.083307 | 0.267828 | 0.536608 | -3.131516 | 2.344771
0.3 | 0.64 | -1.76 | -3.57 | -4.95 | 4.49 | heptane | 0.61919 | -1.685129 | -3.477189 | -4.983132 | 4.640776
0.09 | 0.67 | -1.62 | -3.59 | -4.87 | 4.43 | hexadecane | 0.659893 | -1.59632 | -3.559573 | -4.880281 | 4.47815
0.33 | 0.56 | -1.71 | -3.58 | -4.94 | 4.46 | hexane | 0.53342 | -1.631725 | -3.473425 | -4.980698 | 4.634317
-0.19 | 0.3 | -0.31 | -3.21 | -4.65 | 4.59 | iodobenzene | 0.312539 | -0.352762 | -3.271785 | -4.629052 | 4.489752
-0.61 | 0.93 | -1.15 | -1.68 | -4.09 | 4.25 | isopropyl myristate | 0.977259 | -1.294959 | -1.870114 | -4.017729 | 3.939081
0.12 | 0.38 | -0.6 | -2.98 | -4.96 | 4.54 | m-xylene | 0.366587 | -0.574078 | -2.941283 | -4.976929 | 4.598299
0.28 | 0.33 | -0.71 | 0.24 | -3.32 | 3.55 | methanol | 0.311909 | -0.649107 | 0.329542 | -3.354582 | 3.690751
0.35 | 0.22 | -0.15 | -1.04 | -4.53 | 3.97 | methyl acetate | 0.194997 | -0.067588 | -0.923983 | -4.571216 | 4.15239
0.34 | 0.31 | -0.82 | -0.62 | -5.1 | 4.43 | methyl tert-butyl ether | 0.279699 | -0.737134 | -0.510026 | -5.139775 | 4.600429
0.25 | 0.78 | -1.98 | -3.52 | -4.29 | 4.53 | methylcyclohexane | 0.762327 | -1.924196 | -3.439318 | -4.323834 | 4.654703
0.28 | 0.13 | -0.44 | 1.18 | -4.73 | 3.86 | N-ethylacetamide | 0.105071 | -0.374993 | 1.269385 | -4.764184 | 4.002187
0.22 | 0.03 | -0.17 | 0.94 | -4.59 | 3.73 | N-ethylformamide | 0.016449 | -0.114321 | 1.004651 | -4.616979 | 3.84333
-0.03 | 0.7 | -0.06 | 0.01 | -4.09 | 3.41 | N-formylmorpholine | 0.6981457 | -0.0694897 | 0.0048883 | -4.0885654 | 3.38906
0.06 | 0.33 | 0.26 | 1.56 | -5.04 | 3.98 | N-methyl-2-piperidone | 0.3271873 | 0.2705115 | 1.5746338 | -5.0436057 | 4.0124292
0.09 | 0.21 | -0.17 | 1.31 | -4.59 | 3.83 | N-methylacetamide | 0.19721 | -0.150831 | 1.334533 | -4.600626 | 3.879615
0.11 | 0.41 | -0.29 | 0.54 | -4.09 | 3.47 | N-methylformamide | 0.397604 | -0.260136 | 0.578616 | -4.099689 | 3.529845
0.15 | 0.53 | 0.23 | 0.84 | -4.79 | 3.67 | N-methylpyrrolidinone | 0.519565 | 0.259902 | 0.887089 | -4.813222 | 3.749914
-0.2 | 0.54 | 0.04 | -2.33 | -4.61 | 4.31 | nitrobenzene | 0.551741 | -0.003723 | -2.388352 | -4.584066 | 4.213974
0.02 | -0.09 | 0.79 | -1.46 | -4.36 | 3.46 | nitromethane | -0.0933342 | 0.7985957 | -1.4544755 | -4.3676129 | 3.4722537
0.24 | 0.62 | -1.71 | -3.53 | -4.92 | 4.48 | nonane | 0.599859 | -1.65665 | -3.456521 | -4.951366 | 4.605711
0.08 | 0.52 | -0.81 | -2.88 | -4.82 | 4.56 | o-xylene | 0.511059 | -0.793315 | -2.857401 | -4.831364 | 4.601817
-0.1 | 0.15 | -0.84 | -0.44 | -4.04 | 4.13 | octadecanol | 0.155261 | -0.863525 | -0.466854 | -4.028093 | 4.075935
0.23 | 0.74 | -1.84 | -3.59 | -4.91 | 4.5 | octane | 0.719433 | -1.785636 | -3.512058 | -4.936095 | 4.620999
0.17 | 0.48 | -0.81 | -2.94 | -4.87 | 4.53 | p-xylene | 0.463092 | -0.772761 | -2.885801 | -4.895116 | 4.617725
0.57 | 0.72 | -1.03 | -1.3 | -4.51 | 3.45 | peanut oil | 0.66965 | -0.89221 | -1.12075 | -4.58151 | 3.74435
0.37 | 0.39 | -1.57 | -3.54 | -5.22 | 4.51 | pentane | 0.35651 | -1.481294 | -3.418818 | -5.261024 | 4.703599
0 | 0.17 | 0.5 | -1.28 | -4.41 | 3.42 | propylene carbonate | 0.1672359 | 0.505135 | -1.2809844 | -4.4080414 | 3.4234811
0 | 0.15 | 0.6 | -0.38 | -4.54 | 3.29 | sulfolane | 0.1468503 | 0.6009136 | -0.3799049 | -4.541574 | 3.2903215
0.22 | 0.36 | -0.38 | -0.24 | -4.93 | 4.45 | THF | 0.345051 | -0.331628 | -0.167145 | -4.96046 | 4.564853
0.13 | 0.43 | -0.64 | -3 | -4.75 | 4.52 | toluene | 0.420597 | -0.614527 | -2.961869 | -4.763681 | 4.588524
0.33 | 0.57 | -0.84 | -1.07 | -4.33 | 3.92 | tributyl phosphate | 0.543888 | -0.760593 | -0.965937 | -4.373768 | 4.087161
0.4 | -0.09 | -0.59 | -1.28 | -1.27 | 3.09 | trifluoroethanol | -0.125647 | -0.50143 | -1.155862 | -1.322677 | 3.290636
0.06 | 0.6 | -1.66 | -3.42 | -5.12 | 4.62 | undecane | 0.5979334 | -1.6471654 | -3.4017847 | -5.1276719 | 4.6493317
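The change measure used above (average absolute deviation of a coefficient times the mean of its descriptor) can be sketched as follows. Only the first five v and v_0 values from the table are used, and mean_V is a hypothetical placeholder because the mean descriptor value over the 2144 compounds is not given here, so the result is not expected to reproduce the reported 0.124.

```r
# v and v_0 for 1-butanol, 1-chlorobutane, 1-decanol, 1-heptanol, 1-hexadecene
# (copied from the table of original and c = 0 coefficients).
v   <- c(4.04, 4.46, 4.28, 4.32, 4.32)
v_0 <- c(4.12895, 4.57048, 4.2497972, 4.3352649, 4.38202)

# Average absolute deviation between original and adjusted coefficient.
aae_v <- mean(abs(v_0 - v))

# Hypothetical placeholder for Mean(V); the real value comes from the
# descriptor database and is not stated in this page.
mean_V <- 1.0

change_v <- aae_v * mean_V
print(round(c(AAE = aae_v, change = change_v), 4))
```

Repeating the calculation for e, s, a and b over all 90 solvents, with the true descriptor means, would presumably give the ordering quoted in the text.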
Using the adjusted coefficients above, new random forest (RF) models were created using R (v3.0.0) and Rajarshi Guha's CDK Descriptor Calculator (v1.3.9). First we used R to perform feature selection:
library(caret) #for feature selection
setwd(".../MakingCZero")
mydata <- read.csv(file="CDKReady4RFeatureSelectection.csv", header=TRUE, row.names="Title")
ncol(mydata) # number of descriptor columns before selection
 
[output]
[1] 207
[output]
 
nzv <- nearZeroVar(mydata) # indices of zero and near-zero variance columns
mydata <- mydata[, -nzv]
ncol(mydata)
 
[output]
[1] 111
[output]
 
cor.mat = cor(mydata)
## find correlation r > 0.90
highCorr <- findCorrelation(cor.mat, cutoff = .90, verbose = TRUE)
## remove the highly correlated columns
mydata <- mydata[, -highCorr]
ncol(mydata)
 
[output]
[1] 68
[output]
 
write.csv(mydata, file = "CDKFeatureSelected.csv")
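Any descriptor table calculated later (for example for candidate solvents) has to contain every column that survived this selection before a model trained on the selected features can be applied to it. A minimal sketch of that check, assuming two small made-up tables; the descriptor names, row names and values below are hypothetical stand-ins, not the actual contents of CDKFeatureSelected.csv.

```r
# 'selected' stands in for the feature-selected training table,
# 'candidates' for freshly calculated descriptors of new solvents.
selected <- data.frame(XLogP = c(1.2, 0.3), nAtom = c(12, 9),
                       row.names = c("s1", "s2"))
candidates <- data.frame(nAtom = c(15, 20), Zagreb = c(40, 52),
                         XLogP = c(-0.5, 2.1),
                         row.names = c("n1", "n2"))

# Stop early if a selected descriptor was not calculated for the candidates.
missing_cols <- setdiff(colnames(selected), colnames(candidates))
stopifnot(length(missing_cols) == 0)

# Keep only the selected descriptors, in the same column order.
candidates_selected <- candidates[, colnames(selected), drop = FALSE]
print(candidates_selected)
```

Reordering the columns is not strictly needed for a formula-based model, but it makes the two tables easy to compare by eye.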
Then the models themselves (one each for e_0, s_0, a_0, b_0, v_0) were created using code like the following, shown here for a_0:
library("randomForest") #for modeling
setwd(".../MakingCZero")
mydata <- read.csv(file="CDKReady4Ra.csv", header=TRUE, row.names="Title")
 
mydata.rf <- randomForest(a_0 ~ ., data = mydata,importance = TRUE)
print(mydata.rf)
[output]
Call:
 randomForest(formula = a_0 ~ ., data = mydata, importance = TRUE)
               Type of random forest: regression
                     Number of trees: 500
No. of variables tried at each split: 22
 
          Mean of squared residuals: 0.2272567
                    % Var explained: 91.89
[output]
## get variable importance plot
varImpPlot(mydata.rf,main="Random Forest Variable Importance")
 
## save the model
saveRDS(mydata.rf, file = "arfmodel")
 
## predict the training set with the full forest
## (predict(mydata.rf) with no new data would return out-of-bag predictions instead)
test.predict <- predict(mydata.rf, mydata)
## write the predictions to the working directory
write.csv(test.predict, file = "RFTestPredicta.csv")
The models were used to predict the coefficients of the training set to check whether any of the solvents were outliers, which could indicate that certain solvent coefficients are in need of updating. The solvents with the largest errors were (the first 5 being especially suspect): trifluoroethanol, carbon disulfide, formamide, isopropyl myristate, ethylene glycol, DMF, octadecanol, DMSO, chloroform, nitromethane, carbon tetrachloride, N-formylmorpholine, methylcyclohexane, sulfolane, N-methylacetamide.
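One simple way to rank solvents by prediction error is sketched below for a single coefficient. The solvent labels and all numbers are made up for illustration, and how errors were actually combined across the five coefficients is not stated here, so this is an assumption rather than the procedure used.

```r
# Hypothetical training values and model predictions for one coefficient.
results <- data.frame(
  solvent   = c("solvent A", "solvent B", "solvent C", "solvent D"),
  measured  = c(-3.10, 0.25, -1.40, 0.90),
  predicted = c(-3.00, 0.30, -0.60, 0.85),
  stringsAsFactors = FALSE
)

# Absolute error per solvent, largest first.
results$abs_error <- abs(results$predicted - results$measured)
ranked <- results[order(results$abs_error, decreasing = TRUE), ]
print(ranked)
```

Solvents at the top of such a ranking are the candidates for re-measurement.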

EPA Solvents

SMILES for the potential new safe solvents were extracted from ChemSpider using the CAS numbers and names. Solvents that already have measured coefficients were excluded, as were Fatty acids (C16-18 and C18-unsatd., methyl esters), (Glycerides, mixed decanoyl and octanoyl), (Soybean oil, methyl esters), (Tripropylene glycol n-butyl ether), (White mineral oil, petroleum), (Fatty acids, C12-18, methyl esters), and Polypropylene glycol.

CDK descriptors were then calculated which in turn allowed us to predict the solvent coefficients:
List Call | CAS | List Name | solvent SMILES | solvent CSID | e_0p | s_0p | a_0p | b_0p | v_0p
Green [Circle] | 107-41-5 | 2-Methyl-2,4-pentanediol | C(C(C)(C)O)C(C)O | 13884973 | 0.286 | -0.584 | -0.082 | -3.820 | 3.973
Green [Circle] | 107-88-0 | 1,3-Butanediol | C(C(C)O)CO | 13837670 | 0.414 | -0.656 | 0.180 | -3.759 | 3.927
Green [Circle] | 57-55-6 | 1,2-Propanediol | C(C(C)O)O | 13835224 | 0.386 | -0.452 | 0.253 | -3.450 | 3.590
Green [Circle] | 107-98-2 | 1-Methoxy-2-propanol | C(C(C)O)OC | 7612 | 0.314 | -0.645 | 0.034 | -3.545 | 3.935
Green [Circle] | 56-81-5 | Glycerol | C(C(CO)O)O | 733 | 0.405 | -0.433 | 0.069 | -3.422 | 3.476
Green [Circle] | 110-98-5 | 1,1'-Dimethyldiethylene glycol | C(C(O)C)OCC(O)C | 7796 | 0.356 | -0.356 | -0.217 | -3.615 | 3.848
Green [Circle] | 25265-71-8 | Dipropylene glycol | C(C(O)OC(CC)O)C | 30467 | 0.368 | -0.375 | -0.356 | -4.018 | 3.983
Green [Circle] | 88917-22-0 | Propanol 1 (or 2)-2-methoxymethyl ethoxy, acetate | C(C(OC(=O)C)OCCCOC)C | 2299699 | 0.358 | -0.226 | -0.853 | -4.453 | 4.016
Green [Circle] | 56539-66-3 | 3-Methyl-3-methoxybutanol | C(C(OC)(C)C)CO | 55953 | 0.311 | -0.636 | -0.195 | -3.886 | 4.001
Green [Circle] | 106-65-0 | Dimethyl succinate | C(C(OC)=O)CC(OC)=O | 13848341 | 0.342 | 0.052 | -0.778 | -4.438 | 3.832
Green [Circle] | 627-93-0 | Dimethyl adipate | C(C(OC)=O)CCCC(OC)=O | 11824 | 0.347 | -0.153 | -0.938 | -4.456 | 3.881
Green [Circle] | 20324-32-7 | 1-(2-Methoxy-1-methylethoxy)-2-propanol | C(C(OCC(C)O)C)OC | 23782 | 0.332 | -0.540 | -0.259 | -3.537 | 3.838
Green [Circle] | 108-65-6 | Propylene glycol methyl ether acetate | C(C)(=O)OC(COC)C | 7658 | 0.273 | -0.034 | -0.702 | -4.508 | 4.005
Green [Circle] | 5131-66-8 | Propylene glycol n-butyl ether | C(CCC)OCC(C)O | 19942 | 0.424 | -0.629 | -0.297 | -3.632 | 4.062
Green [Circle] | 1119-40-0 | Dimethyl glutarate | C(CCCC(=O)OC)(=O)OC | 13605 | 0.342 | -0.099 | -0.855 | -4.422 | 3.849
Green [Circle] | 504-63-2 | 1,3-Propanediol | C(CO)CO | 13839553 | 0.434 | -0.626 | 0.231 | -3.732 | 3.601
Green [Circle] | 34590-94-8 | Dipropylene glycol methyl ether | C(OC(CO)C)C(OC)C | 23783 | 0.308 | -0.337 | -0.252 | -3.650 | 3.926
Green [Circle] | 1569-01-3 | 1-Propoxy-2-propanol | C(OCCC)C(C)O | 14551 | 0.417 | -0.580 | -0.238 | -3.607 | 3.988
Green [Circle] | 14035-94-0 | Pentanedioic acid, 2-methyl-, 1,5-dimethyl ester | CC(C(=O)OC)CCC(=O)OC | 105525 | 0.344 | -0.144 | -0.783 | -4.365 | 3.879
Green [Circle] | 108-32-7 | Propylene carbonate | CC1COC(O1)=O | 7636 | 0.218 | 0.300 | -1.024 | -4.347 | 3.581
Half Green [Circle] | 4437-85-8 | 1,3-Dioxolan-2-one, 4-ethyl- | C(C1OC(=O)OC1)C | 96547 | 0.263 | 0.085 | -0.850 | -4.477 | 3.864
Half Green [Circle] | 931-40-8 | 4-Hydroxymethyl-1,3-dioxolan-2-one | C(C1OC(=O)OC1)O | 88417 | 0.282 | 0.082 | -0.587 | -3.530 | 3.529
Half Green [Circle] | 97-64-3 | Ethyl lactate | C(OC(C(C)O)=O)C | 13837423 | 0.241 | -0.067 | -0.402 | -3.764 | 3.962
Yellow [Triangle] | 5989-27-5 | D-limonene | [C@H]1(C(=C)C)CC=C(CC1)C | 20939 | 0.558 | -1.297 | -3.188 | -4.832 | 4.527
Yellow [Triangle] | 29911-28-2 | Dipropylene glycol monobutyl ether | C(C(OCC(C)O)C)OCCCC | 23142 | 0.464 | -0.715 | -0.589 | -3.693 | 3.983
Yellow [Triangle] | 112-34-5 | Diethylene glycol mono-N-butyl ether | C(C)CCOCCOCCO | 13839549 | 0.461 | -0.549 | -0.362 | -3.569 | 3.813
Yellow [Triangle] | 112-53-8 | 1-Dodecanol | C(CCCCCCCCCCC)O | 7901 | 0.517 | -1.170 | -0.181 | -4.132 | 4.148
Yellow [Triangle] | 25498-49-1 | Propanol, [2-(2-methoxymethylethoxy)methylethoxy]- | COCCCOCCCOC(CC)O | 30564 | 0.426 | -0.508 | -0.459 | -3.829 | 3.894
By calculating the scaled distance from each candidate to each solvent with known coefficients, sqrt(sum(((measured - predicted)/measuredSD)^2)) with the sum taken over the coefficients, we identified possible solvent replacements that are predicted to have similar solvation properties:
Current Solvent | Possible Alternate Solvent | Distance
1-octanol | 1-dodecanol | 0.295
ethanol | 1,3-butanediol | 0.576
1-propanol | 1,3-butanediol | 0.585
acetone | propylene glycol methyl ether acetate | 0.499
methyl acetate | propylene glycol methyl ether acetate | 0.502
benzonitrile | propylene glycol methyl ether acetate | 0.576
1,4-dioxane | propylene glycol methyl ether acetate | 0.677
methanol | 1,2-propanediol | 0.517
methanol | 1-(2-methoxy-1-methylethoxy)-2-propanol | 0.574
methanol | 1-methoxy-2-propanol | 0.617
methanol | glycerol | 0.722
2,2,4-trimethylpentane | D-limonene | 0.619
hexane | D-limonene | 0.621
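The distance calculation can be sketched for one pair, using the c = 0 coefficients of 1-octanol and the predicted coefficients of 1-dodecanol from the tables above. The measuredSD values below are hypothetical placeholders (the actual standard deviations are not listed in this page), and scaled_distance is an illustrative helper name, so the result should not be expected to match the tabulated 0.295.

```r
# Scaled Euclidean distance between two coefficient vectors.
scaled_distance <- function(measured, predicted, measuredSD) {
  sqrt(sum(((measured - predicted) / measuredSD)^2))
}

# 1-octanol: c = 0 adjusted coefficients (e_0, s_0, a_0, b_0, v_0).
measured <- c(e = 0.4909845, s = -1.0517092, a = -0.0336817,
              b = -4.2309735, v = 4.2009611)
# 1-dodecanol: predicted coefficients (e_0p, s_0p, a_0p, b_0p, v_0p).
predicted <- c(e = 0.517, s = -1.170, a = -0.181, b = -4.132, v = 4.148)

# Hypothetical spread of each coefficient across the known solvents.
measuredSD <- c(e = 0.2, s = 0.6, a = 1.6, b = 0.9, v = 0.8)

d <- scaled_distance(measured, predicted, measuredSD)
print(d)
```

Dividing by the standard deviation puts the five coefficients on a comparable scale, so that a coefficient with a wide spread across solvents does not dominate the distance.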
Principal component analysis (PCA) in R was used to help visualize where current and potential new solvents lie in chemical space.
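setwd(".../MakingCZero")
mydata <- read.csv(file="EPA-PCA4R.csv", header=TRUE, row.names="Title")
## scale. = TRUE standardizes each column to unit variance before the PCA
pc1 <- prcomp(mydata, scale. = TRUE)
## principal component scores, one row per solvent
x <- pc1$x
summary(pc1)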
 
[output]
Importance of components:
                          PC1    PC2    PC3     PC4    PC5
Standard deviation     1.7181 1.0367 0.7622 0.56519 0.2702
Proportion of Variance 0.5904 0.2150 0.1162 0.06389 0.0146
Cumulative Proportion  0.5904 0.8053 0.9215 0.98540 1.0000
[output]
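The "Proportion of Variance" row follows directly from the standard deviations: each component's variance (its standard deviation squared) is divided by the total variance. A short check using the values printed above:

```r
# Standard deviations of the principal components, copied from the summary.
sdev <- c(PC1 = 1.7181, PC2 = 1.0367, PC3 = 0.7622, PC4 = 0.56519, PC5 = 0.2702)

# Variance of each component as a share of the total variance.
prop_var <- sdev^2 / sum(sdev^2)
cum_prop <- cumsum(prop_var)

print(round(rbind(prop_var, cum_prop), 4))
```

The first two components together account for roughly 80% of the variance, which is why a two-dimensional plot of PC1 against PC2 gives a reasonable picture of the solvent space.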

Results

Solvents recommended to be updated with high priority: trifluoroethanol, carbon disulfide, formamide, isopropyl myristate, ethylene glycol.

Possible alternative solvents
Current Solvent | Possible Alternate Solvent | Distance
1-octanol | 1-dodecanol | 0.295
ethanol | 1,3-butanediol | 0.576
1-propanol | 1,3-butanediol | 0.585
acetone | propylene glycol methyl ether acetate | 0.499
methyl acetate | propylene glycol methyl ether acetate | 0.502
benzonitrile | propylene glycol methyl ether acetate | 0.576
1,4-dioxane | propylene glycol methyl ether acetate | 0.677
methanol | 1,2-propanediol | 0.517
methanol | 1-(2-methoxy-1-methylethoxy)-2-propanol | 0.574
methanol | 1-methoxy-2-propanol | 0.617
methanol | glycerol | 0.722
2,2,4-trimethylpentane | D-limonene | 0.619
hexane | D-limonene | 0.621

Possible new safe solvents in a new part of the chemical space: 4-Hydroxymethyl-1,3-dioxolan-2-one and Ethyl lactate.


References

[1] Paul C.M. van Noort. Solvation thermodynamics and the physical–chemical meaning of the constant in Abraham solvation equations. Chemosphere (2011), doi:10.1016/j.chemosphere.2011.11.073