Comparing the accuracy of measured Abraham descriptors, model001, and the modified Platts fragment method


Compounds studied: benzoic acid, 2-chloro-5-nitrobenzoic acid, 2-methoxybenzoic acid, 2-methylbenzoic acid, 4-acetamidobenzoic acid, 4-aminobenzoic acid, 4-chloro-3-nitrobenzoic acid, 4-fluorobenzoic acid, 4-hydroxybenzoic acid, 4-methoxybenzoic acid, 4-nitrobenzoic acid

Researcher

Andrew Lang

Objective

To compare the accuracy of the modified Platts fragment method (MPFM), model001, and measured Abraham descriptors (measuredAD) in predicting the solubilities of benzoic acid derivatives in organic solvents.

| compound | SMILES | CSID | E | S | A | B | V | data pts |
|---|---|---|---|---|---|---|---|---|
| benzoic acid | c1ccc(cc1)C(=O)O | 238 | 0.921 | 0.849 | 0.314 | 0.422 | 0.932 | 25 |
| 2-chloro-5-nitrobenzoic acid | O=C(O)c1cc(ccc1Cl)[N+]([O-])=O | 16359 | 1.407 | 1.409 | 0.487 | 0.674 | 1.228 | 20 |
| 2-methoxybenzoic acid | O=C(O)c1ccccc1OC | 10892 | 0.903 | 1.346 | 0.436 | 0.640 | 1.131 | 22 |
| 2-methylbenzoic acid | O=C(O)c1ccccc1C | 8070 | 0.936 | 0.785 | 0.271 | 0.475 | 1.073 | 20 |
| 4-acetamidobenzoic acid | O=C(Nc1ccc(cc1)C(=O)O)C | 18177 | 1.433 | 1.360 | 0.788 | 0.930 | 1.329 | 4 |
| 4-aminobenzoic acid | O=C(O)c1ccc(N)cc1 | 953 | 1.308 | 1.079 | 0.498 | 0.745 | 1.031 | 24 |
| 4-chloro-3-nitrobenzoic acid | O=[N+]([O-])c1cc(ccc1Cl)C(=O)O | 7044 | 1.407 | 1.414 | 0.514 | 0.536 | 1.228 | 20 |
| 4-fluorobenzoic acid | O=C(O)c1ccc(F)cc1 | 9579 | 0.805 | 0.946 | 0.367 | 0.393 | 0.949 | 5 |
| 4-hydroxybenzoic acid | c1cc(ccc1C(=O)O)O | 132 | 1.183 | 0.860 | 0.922 | 0.591 | 0.990 | 18 |
| 4-methoxybenzoic acid | COc1ccc(cc1)C(O)=O | 10181338 | 0.903 | 1.306 | 0.436 | 0.560 | 1.131 | 18 |
| 4-nitrobenzoic acid | O=[N+]([O-])c1ccc(C(=O)O)cc1 | 5882 | 1.254 | 1.412 | 0.525 | 0.445 | 1.106 | 21 |

Background

The Abraham descriptors of a solute can be calculated by regressing known solubility values against Abraham's general solvation equations; with these descriptors it is possible to predict the solubility of the solute in over seventy different organic solvents [Abraham09]. If solubility data are unavailable, Abraham descriptors can be approximated from models, either by summing fragment contributions [Platts99] or via a QSAR model such as model001. A third way to approximate the Abraham descriptors of a compound is to start with the known Abraham descriptors of a structurally similar compound and convert that compound into the target compound fragment by fragment, keeping track of how the descriptors change according to Platts' values for the fragments: in effect, a modified Platts fragment method (MPFM). We have used this method to calculate the Abraham descriptors of 1,2,4-triazole derivatives from the known descriptors of 1,2,4-triazole, and of boc-protected amino acids from the known descriptors of boc-glycine. However, the accuracy of the method for both the 1,2,4-triazole derivatives and the boc-protected amino acids can only be determined by confirming the predictions against measured solubility values, something that is in progress at the time of writing (2010-12-29); see Exp194 and Exp195.

To better understand the reliability of the MPFM, we can use it to approximate the Abraham descriptors of certain benzoic acid derivatives, ones that already have measured Abraham descriptors (measuredAD), starting from the known Abraham descriptors of benzoic acid. We can then compare the solubility predictions of all the methods (measuredAD, model001, the Platts fragment method, and the MPFM) against measured solubility values. We shall also examine how the accuracy is affected by using a VCC-predicted aqueous solubility value in the solvation equations instead of a measured aqueous solubility value.

Calculating the Abraham Descriptors

The measured Abraham descriptors of benzoic acid and the benzoic acid derivatives are given in the table above. These values were calculated by regressing measured solubility values against Abraham's general solvation equations, excluding measurements in non-polar solvents (alkanes, 1,4-dioxane, benzene, carbon disulfide, carbon tetrachloride, and toluene) to avoid possible issues with dimerization. The number of data points used in each regression can be found in the same table. From these values, we can calculate the MPFM Abraham descriptors S, A, and B of the benzoic acid derivatives (E and V can always be calculated from structure) by converting benzoic acid into each derivative fragment-wise and keeping track of the changes using the fragment values given in Platts' paper. Computations can be aided by the PaDEL-Descriptor software.
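
The fragment-wise conversion can be sketched in a few lines. This is only an illustration, assuming that the net fragment change from benzoic acid to a derivative equals the difference between the two whole-molecule Platts estimates; that assumption reproduces the MPFM values tabulated on this page (for example S = 1.172 for 2-chloro-5-nitrobenzoic acid). The function and variable names are hypothetical.

```python
# Measured and Platts-estimated descriptors of the benzoic acid core,
# copied from the tables on this page.
CORE_MEASURED = {"S": 0.849, "A": 0.314, "B": 0.422}
CORE_PLATTS = {"S": 0.934, "A": 0.591, "B": 0.459}


def mpfm_descriptors(derivative_platts):
    """Shift the measured core values by the Platts-predicted change.

    Assumption: the sum of fragment changes equals the difference between
    the Platts estimate of the derivative and the Platts estimate of the core.
    """
    return {
        key: CORE_MEASURED[key] + (derivative_platts[key] - CORE_PLATTS[key])
        for key in CORE_MEASURED
    }


# Platts estimates for 2-chloro-5-nitrobenzoic acid (from the tables below).
platts_2cl5no2 = {"S": 1.257, "A": 0.676, "B": 0.200}
mpfm_2cl5no2 = {k: round(v, 3) for k, v in mpfm_descriptors(platts_2cl5no2).items()}
print(mpfm_2cl5no2)
```

Because only the difference between two Platts estimates is used, any constant error the Platts method makes on the benzoic acid core cancels out; errors in individual fragment values do not.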

S: the solute dipolarity/polarizability

| compound | S-measuredAD | S-MPFM | S-Platts | S-model001 |
|---|---|---|---|---|
| benzoic acid | 0.849 | N/A | 0.934 | 1.127 |
| 2-chloro-5-nitrobenzoic acid | 1.409 | 1.172 | 1.257 | 1.739 |
| 2-methoxybenzoic acid | 1.346 | 1.010 | 1.095 | 1.153 |
| 2-methylbenzoic acid | 0.785 | 0.825 | 0.910 | 1.070 |
| 4-acetamidobenzoic acid | 1.360 | 1.782 | 1.867 | 1.644 |
| 4-aminobenzoic acid | 1.079 | 1.283 | 1.368 | 1.223 |
| 4-chloro-3-nitrobenzoic acid | 1.414 | 1.129 | 1.214 | 1.738 |
| 4-fluorobenzoic acid | 0.946 | 0.900 | 0.985 | 1.145 |
| 4-hydroxybenzoic acid | 0.860 | 1.147 | 1.232 | 1.145 |
| 4-methoxybenzoic acid | 1.306 | 1.010 | 1.095 | 1.153 |
| 4-nitrobenzoic acid | 1.412 | 1.039 | 1.124 | 1.720 |
| AAE | N/A | 0.253 | 0.243 | 0.251 |
| RMSE | N/A | 0.280 | 0.274 | 0.259 |

Figure: DeltaS.png (change in S relative to benzoic acid for each derivative)

Graphing the difference between the values of S for each derivative and the value of S for benzoic acid using all three methods, we see that both the Platts method and model001 correctly predict the direction that S should change, with comparable accuracy. The absolute accuracy of the MPFM for predicting S is not significantly different from that of the Platts model. This is because the Platts model predicts the core benzoic acid value for S with good accuracy (0.934 as compared to 0.849, a difference of only 0.085).
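
The AAE and RMSE rows can be reproduced from the table. The sketch below recomputes both for the S-MPFM column over the ten derivatives, leaving out benzoic acid itself, which has no MPFM value; the helper names are hypothetical.

```python
import math


def average_absolute_error(predicted, observed):
    """Mean of |predicted - observed| over paired values."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def root_mean_square_error(predicted, observed):
    """Square root of the mean squared difference over paired values."""
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    )


# S values for the ten derivatives, in the order they appear in the table.
s_measured = [1.409, 1.346, 0.785, 1.360, 1.079, 1.414, 0.946, 0.860, 1.306, 1.412]
s_mpfm = [1.172, 1.010, 0.825, 1.782, 1.283, 1.129, 0.900, 1.147, 1.010, 1.039]

s_aae = average_absolute_error(s_mpfm, s_measured)
s_rmse = root_mean_square_error(s_mpfm, s_measured)
print(round(s_aae, 3), round(s_rmse, 3))  # 0.253 0.28
```

The same two functions apply unchanged to the A and B tables and to the solubility predictions.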

A: the hydrogen bond acidity

| compound | A-measuredAD | A-MPFM | A-Platts | A-model001 |
|---|---|---|---|---|
| benzoic acid | 0.314 | N/A | 0.591 | 0.496 |
| 2-chloro-5-nitrobenzoic acid | 0.487 | 0.399 | 0.676 | 0.594 |
| 2-methoxybenzoic acid | 0.436 | 0.314 | 0.591 | 0.256 |
| 2-methylbenzoic acid | 0.271 | 0.314 | 0.591 | 0.397 |
| 4-acetamidobenzoic acid | 0.788 | 0.223 | 0.500 | 0.697 |
| 4-aminobenzoic acid | 0.498 | 0.561 | 0.838 | 0.486 |
| 4-chloro-3-nitrobenzoic acid | 0.514 | 0.454 | 0.731 | 0.590 |
| 4-fluorobenzoic acid | 0.367 | 0.369 | 0.646 | 0.520 |
| 4-hydroxybenzoic acid | 0.922 | 0.953 | 1.230 | 0.684 |
| 4-methoxybenzoic acid | 0.436 | 0.314 | 0.591 | 0.312 |
| 4-nitrobenzoic acid | 0.525 | 0.369 | 0.646 | 0.612 |
| AAE | N/A | 0.125 | 0.237 | 0.119 |
| RMSE | N/A | 0.198 | 0.249 | 0.133 |

Figure: DeltaA.png (change in A relative to benzoic acid for each derivative)

Graphing the difference between the value of A for each derivative and the value of A for benzoic acid using all three methods shows how the Platts method and model001 predict the change in A. The Platts method predicts the correct sign for the change (or predicts no change) in A for every derivative except 4-acetamidobenzoic acid, whereas model001 predicts the correct sign except for 4-aminobenzoic acid, 2-methoxybenzoic acid, and 4-methoxybenzoic acid. Examining the table above, we see that the MPFM does better than the Platts model in predicting the values of A (interestingly, model001 does well here too). This is because the Platts model predicts the change in A well but predicts the initial value of A for benzoic acid poorly (0.591 as compared to 0.314, a difference of 0.277).
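
The sign comparison described above can be checked directly from the A table. In this sketch a predicted change of exactly zero is not counted as a wrong sign, which is one reading of "or no change" in the text; the names are hypothetical.

```python
# A values for benzoic acid (the core) and its derivatives, from the table above.
A_CORE = {"measured": 0.314, "platts": 0.591, "model001": 0.496}
A_DERIVATIVES = {
    "2-chloro-5-nitrobenzoic acid": {"measured": 0.487, "platts": 0.676, "model001": 0.594},
    "2-methoxybenzoic acid": {"measured": 0.436, "platts": 0.591, "model001": 0.256},
    "2-methylbenzoic acid": {"measured": 0.271, "platts": 0.591, "model001": 0.397},
    "4-acetamidobenzoic acid": {"measured": 0.788, "platts": 0.500, "model001": 0.697},
    "4-aminobenzoic acid": {"measured": 0.498, "platts": 0.838, "model001": 0.486},
    "4-chloro-3-nitrobenzoic acid": {"measured": 0.514, "platts": 0.731, "model001": 0.590},
    "4-fluorobenzoic acid": {"measured": 0.367, "platts": 0.646, "model001": 0.520},
    "4-hydroxybenzoic acid": {"measured": 0.922, "platts": 1.230, "model001": 0.684},
    "4-methoxybenzoic acid": {"measured": 0.436, "platts": 0.591, "model001": 0.312},
    "4-nitrobenzoic acid": {"measured": 0.525, "platts": 0.646, "model001": 0.612},
}


def wrong_sign(method):
    """List derivatives whose predicted change in A opposes the measured change."""
    wrong = []
    for name, values in A_DERIVATIVES.items():
        measured_delta = values["measured"] - A_CORE["measured"]
        predicted_delta = values[method] - A_CORE[method]
        # A negative product means the two changes point in opposite directions.
        if measured_delta * predicted_delta < 0:
            wrong.append(name)
    return wrong


platts_wrong = wrong_sign("platts")
model001_wrong = wrong_sign("model001")
print(platts_wrong)
print(model001_wrong)
```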

B: the hydrogen bond basicity

| compound | B-measuredAD | B-MPFM | B-Platts | B-model001 |
|---|---|---|---|---|
| benzoic acid | 0.422 | N/A | 0.459 | 0.524 |
| 2-chloro-5-nitrobenzoic acid | 0.674 | 0.163 | 0.200 | 0.641 |
| 2-methoxybenzoic acid | 0.640 | 0.629 | 0.666 | 0.665 |
| 2-methylbenzoic acid | 0.475 | 0.418 | 0.455 | 0.547 |
| 4-acetamidobenzoic acid | 0.930 | 0.880 | 0.917 | 0.985 |
| 4-aminobenzoic acid | 0.745 | 0.686 | 0.723 | 0.746 |
| 4-chloro-3-nitrobenzoic acid | 0.536 | 0.240 | 0.277 | 0.645 |
| 4-fluorobenzoic acid | 0.393 | 0.411 | 0.448 | 0.549 |
| 4-hydroxybenzoic acid | 0.591 | 0.718 | 0.755 | 0.543 |
| 4-methoxybenzoic acid | 0.560 | 0.629 | 0.666 | 0.643 |
| 4-nitrobenzoic acid | 0.445 | 0.220 | 0.257 | 0.616 |
| AAE | N/A | 0.142 | 0.133 | 0.075 |
| RMSE | N/A | 0.207 | 0.192 | 0.092 |

Figure: DeltaB.png (change in B relative to benzoic acid for each derivative)

Graphing the difference between the value of B for each derivative and the value of B for benzoic acid using all three methods shows how the Platts method and model001 predict the change in B. The Platts method predicts the correct sign (with good accuracy) for the change in B except for 2-methylbenzoic acid and the three nitro-substituted compounds: 2-chloro-5-nitrobenzoic acid, 4-chloro-3-nitrobenzoic acid, and 4-nitrobenzoic acid. This is because Platts gives a large negative value (-0.525) to the -NO2 fragment, fragment 24 in Table 2 of Platts' paper. model001 predicts the correct sign for the change in B except for 4-fluorobenzoic acid. Examining the table above, we see that the accuracy of the MPFM for predicting B is not significantly different from that of the Platts model. This is because the Platts model predicts the core benzoic acid value for B with good accuracy (0.459 as compared to 0.422, a difference of only 0.037). model001 does very well in predicting the value of B, with an AAE of 0.075 and an RMSE of 0.092.

Calculating the Solubilities

The solubilities of the benzoic acid derivatives were calculated using the solubility-from-Abraham-descriptors web service for each method of estimating the Abraham descriptors: measuredAD (calculated using linear regression); MPFM (modified Platts fragment method, adding and subtracting fragment values from the core benzoic acid measuredAD values); Platts (ab initio fragment method, calculated using PaDEL-Explorer); and model001 (ab initio model using CDK descriptors). Columns without the suffix VCC use a measured aqueous solubility value; columns with the suffix VCC use a predicted aqueous solubility value provided by VCC.

| Solute | Solvent | measured | measuredAD | MPFM | MPFM-VCC | Platts | Platts-VCC | model001 | model001-VCC |
|---|---|---|---|---|---|---|---|---|---|
| 2-chloro-5-nitrobenzoic acid | 1-butanol | 0.78 | 0.68 | 7.96 | 7.96 | 7.96 | 4.93 | 0.43 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | 1-decanol | 0.39 | 0.59 | 7.96 | 7.96 | 7.96 | 5.34 | 0.30 | 0.02 |
| 2-chloro-5-nitrobenzoic acid | 1-heptanol | 0.54 | 0.61 | 7.96 | 7.96 | 7.96 | 5.74 | 0.36 | 0.02 |
| 2-chloro-5-nitrobenzoic acid | 1-hexanol | 0.64 | 0.73 | 7.96 | 7.96 | 7.96 | 6.02 | 0.47 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | 1-octanol | 0.48 | 0.57 | 7.96 | 7.96 | 7.96 | 5.24 | 0.39 | 0.02 |
| 2-chloro-5-nitrobenzoic acid | 1-pentanol | 0.74 | 0.76 | 7.96 | 7.96 | 7.96 | 5.95 | 0.45 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | 1-propanol | 0.95 | 0.98 | 7.96 | 7.96 | 7.96 | 7.64 | 0.70 | 0.04 |
| 2-chloro-5-nitrobenzoic acid | 2-butanol | 0.91 | 0.95 | 7.96 | 6.94 | 7.96 | 4.61 | 0.70 | 0.04 |
| 2-chloro-5-nitrobenzoic acid | 2-methyl-1-propanol | 0.63 | 0.71 | 7.96 | 6.68 | 7.96 | 4.41 | 0.49 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | 2-methyl-2-propanol | 1.21 | 0.69 | 7.96 | 7.86 | 7.96 | 5.71 | 0.57 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | 2-pentanol | 0.76 | 0.74 | 7.96 | 7.71 | 7.96 | 4.93 | 0.42 | 0.02 |
| 2-chloro-5-nitrobenzoic acid | 2-propanol | 1.13 | 0.91 | 7.96 | 7.96 | 7.96 | 7.10 | 0.71 | 0.04 |
| 2-chloro-5-nitrobenzoic acid | 3-methyl-1-butanol | 0.68 | 0.64 | 7.96 | 6.74 | 7.96 | 4.02 | 0.37 | 0.02 |
| 2-chloro-5-nitrobenzoic acid | butyl acetate | 0.46 | 0.61 | 7.96 | 7.96 | 7.96 | 7.19 | 0.54 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | dibutyl ether | 0.10 | 0.06 | 7.96 | 5.29 | 7.96 | 1.07 | 0.04 | 0.00 |
| 2-chloro-5-nitrobenzoic acid | diethyl ether | 0.57 | 0.56 | 7.96 | 7.96 | 7.96 | 7.96 | 0.43 | 0.03 |
| 2-chloro-5-nitrobenzoic acid | ethanol | 1.40 | 1.40 | 7.96 | 7.96 | 7.96 | 7.24 | 1.04 | 0.06 |
| 2-chloro-5-nitrobenzoic acid | ethyl acetate | 0.75 | 0.84 | 7.96 | 7.96 | 7.96 | 7.96 | 0.82 | 0.05 |
| 2-chloro-5-nitrobenzoic acid | methyl acetate | 0.89 | 1.08 | 7.96 | 7.96 | 7.96 | 6.46 | 1.17 | 0.07 |
| 2-chloro-5-nitrobenzoic acid | THF | 2.97 | 2.86 | 7.96 | 7.96 | 7.96 | 7.96 | 3.24 | 0.19 |
| 2-methoxybenzoic acid | 1-butanol | 0.57 | 0.41 | 1.16 | 0.77 | 0.70 | 0.47 | 0.56 | 0.37 |
| 2-methoxybenzoic acid | 1-decanol | 0.22 | 0.25 | 0.89 | 0.59 | 0.49 | 0.33 | 0.39 | 0.26 |
| 2-methoxybenzoic acid | 1-heptanol | 0.34 | 0.33 | 1.04 | 0.69 | 0.58 | 0.39 | 0.48 | 0.32 |
| 2-methoxybenzoic acid | 1-hexanol | 0.37 | 0.41 | 1.15 | 0.77 | 0.69 | 0.46 | 0.56 | 0.37 |
| 2-methoxybenzoic acid | 1-octanol | 0.30 | 0.32 | 0.85 | 0.57 | 0.48 | 0.32 | 0.43 | 0.28 |
| 2-methoxybenzoic acid | 1-pentanol | 0.47 | 0.40 | 1.23 | 0.82 | 0.78 | 0.52 | 0.53 | 0.37 |
| 2-methoxybenzoic acid | 1-propanol | 0.76 | 0.57 | 1.41 | 0.94 | 1.05 | 0.70 | 0.67 | 0.45 |
| 2-methoxybenzoic acid | 1,4-dioxane | 1.53 | 1.38 | 2.08 | 1.38 | 0.95 | 0.63 | 1.45 | 0.97 |
| 2-methoxybenzoic acid | 2-butanol | 0.50 | 0.61 | 1.44 | 0.96 | 0.95 | 0.64 | 0.76 | 0.51 |
| 2-methoxybenzoic acid | 2-methyl-1-propanol | 0.41 | 0.48 | 1.24 | 0.82 | 0.82 | 0.54 | 0.62 | 0.41 |
| 2-methoxybenzoic acid | 2-methyl-2-propanol | 0.53 | 0.56 | 1.23 | 0.82 | 0.89 | 0.59 | 0.62 | 0.41 |
| 2-methoxybenzoic acid | 2-pentanol | 0.39 | 0.42 | 1.32 | 0.88 | 0.84 | 0.56 | 0.61 | 0.40 |
| 2-methoxybenzoic acid | 2-propanol | 0.59 | 0.59 | 1.32 | 0.88 | 1.09 | 0.73 | 0.64 | 0.42 |
| 2-methoxybenzoic acid | 3-methyl-1-butanol | 0.36 | 0.40 | 1.22 | 0.81 | 0.72 | 0.48 | 0.58 | 0.39 |
| 2-methoxybenzoic acid | butyl acetate | 0.34 | 0.42 | 0.96 | 0.64 | 0.33 | 0.22 | 0.61 | 0.40 |
| 2-methoxybenzoic acid | chloroform | 0.46 | 0.36 | 1.38 | 0.92 | 0.13 | 0.09 | 1.36 | 0.91 |
| 2-methoxybenzoic acid | dibutyl ether | 0.04 | 0.05 | 0.19 | 0.12 | 0.04 | 0.03 | 0.11 | 0.07 |
| 2-methoxybenzoic acid | diethyl ether | 0.24 | 0.36 | 0.94 | 0.62 | 0.39 | 0.26 | 0.50 | 0.34 |
| 2-methoxybenzoic acid | ethanol | 1.19 | 0.84 | 1.98 | 1.31 | 1.35 | 0.90 | 1.04 | 0.69 |
| 2-methoxybenzoic acid | ethyl acetate | 0.66 | 0.58 | 1.20 | 0.80 | 0.46 | 0.31 | 0.75 | 0.50 |
| 2-methoxybenzoic acid | methanol | 1.82 | 0.96 | 1.84 | 1.23 | 1.27 | 0.86 | 1.10 | 0.73 |
| 2-methoxybenzoic acid | methyl acetate | 0.86 | 0.88 | 1.58 | 1.05 | 0.54 | 0.36 | 1.18 | 0.79 |
| 2-methoxybenzoic acid | THF | 2.02 | 1.71 | 3.00 | 2.00 | 1.57 | 1.05 | 1.80 | 1.20 |
| 2-methylbenzoic acid | 1-heptanol | 1.28 | 1.48 | 2.54 | 5.70 | 1.43 | 3.20 | 0.37 | 0.82 |
| 2-methylbenzoic acid | 1-hexanol | 1.38 | 1.53 | 2.63 | 5.90 | 1.57 | 3.51 | 0.42 | 0.94 |
| 2-methylbenzoic acid | 1-octanol | 1.16 | 1.16 | 2.04 | 4.56 | 1.14 | 2.52 | 0.32 | 0.72 |
| 2-methylbenzoic acid | 1-pentanol | 1.49 | 1.65 | 2.79 | 6.25 | 1.77 | 3.97 | 0.44 | 0.98 |
| 2-methylbenzoic acid | 1-propanol | 1.77 | 1.71 | 2.98 | 6.67 | 2.21 | 4.95 | 0.52 | 1.22 |
| 2-methylbenzoic acid | 2-butanol | 1.75 | 1.53 | 2.55 | 5.70 | 1.69 | 3.79 | 0.52 | 1.17 |
| 2-methylbenzoic acid | 2-methyl-1-propanol | 1.28 | 1.43 | 2.41 | 5.40 | 1.60 | 3.57 | 0.45 | 1.00 |
| 2-methylbenzoic acid | 2-methyl-2-propanol | 2.19 | 1.38 | 2.48 | 5.55 | 1.80 | 4.03 | 0.48 | 1.07 |
| 2-methylbenzoic acid | 2-pentanol | 1.68 | 1.71 | 2.81 | 6.28 | 1.79 | 4.01 | 0.45 | 1.02 |
| 2-methylbenzoic acid | 2-propanol | 1.95 | 1.49 | 2.64 | 5.92 | 2.19 | 4.89 | 0.52 | 1.17 |
| 2-methylbenzoic acid | 3-methyl-1-butanol | 1.34 | 1.53 | 2.51 | 5.63 | 1.50 | 2.35 | 0.41 | 0.92 |
| 2-methylbenzoic acid | butyl acetate | 1.17 | 1.36 | 2.56 | 5.74 | 0.88 | 1.96 | 0.38 | 0.84 |
| 2-methylbenzoic acid | dibutyl ether | 0.58 | 0.40 | 0.73 | 1.63 | 0.15 | 0.33 | 0.06 | 0.14 |
| 2-methylbenzoic acid | diethyl ether | 1.65 | 1.49 | 2.83 | 6.33 | 1.18 | 2.64 | 0.38 | 0.84 |
| 2-methylbenzoic acid | ethanol | 2.08 | 2.10 | 3.53 | 7.90 | 2.42 | 5.41 | 0.72 | 1.62 |
| 2-methylbenzoic acid | ethyl acetate | 1.49 | 1.61 | 3.07 | 6.87 | 1.19 | 2.65 | 0.49 | 1.09 |
| 2-methylbenzoic acid | methyl acetate | 1.51 | 1.67 | 3.01 | 6.75 | 1.03 | 2.30 | 0.59 | 1.33 |
| 2-methylbenzoic acid | THF | 2.92 | 3.65 | 7.33 | 8.46 | 3.84 | 8.46 | 1.30 | 2.92 |
| 4-aminobenzoic acid | 1-heptanol | 0.16 | 0.23 | 0.21 | 0.17 | 0.12 | 0.10 | 0.14 | 0.11 |
| 4-aminobenzoic acid | 1-hexanol | 0.21 | 0.28 | 0.27 | 0.22 | 0.16 | 0.13 | 0.18 | 0.14 |
| 4-aminobenzoic acid | 1-octanol | 0.12 | 0.20 | 0.20 | 0.16 | 0.11 | 0.09 | 0.13 | 0.10 |
| 4-aminobenzoic acid | 1-pentanol | 0.23 | 0.34 | 0.30 | 0.24 | 0.19 | 0.16 | 0.20 | 0.16 |
| 4-aminobenzoic acid | 2-butanol | 0.30 | 0.42 | 0.42 | 0.34 | 0.28 | 0.23 | 0.28 | 0.23 |
| 4-aminobenzoic acid | 2-methyl-1-propanol | 0.19 | 0.31 | 0.30 | 0.24 | 0.20 | 0.16 | 0.20 | 0.16 |
| 4-aminobenzoic acid | 2-propanol | 0.42 | 0.38 | 0.40 | 0.33 | 0.33 | 0.27 | 0.25 | 0.20 |
| 4-aminobenzoic acid | 3-methyl-1-butanol | 0.18 | 0.29 | 0.25 | 0.20 | 0.15 | 0.12 | 0.18 | 0.14 |
| 4-aminobenzoic acid | acetone | 0.69 | 0.28 | 0.43 | 0.35 | 0.19 | 0.15 | 0.25 | 0.20 |
| 4-aminobenzoic acid | benzene | 0.00 | 0.03 | 0.02 | 0.02 | 0.00 | 0.00 | 0.02 | 0.02 |
| 4-aminobenzoic acid | chlorobenzene | 0.01 | 0.02 | 0.02 | 0.01 | 0.00 | 0.00 | 0.02 | 0.01 |
| 4-aminobenzoic acid | chloroform | 0.14 | 0.08 | 0.06 | 0.05 | 0.01 | 0.01 | 0.07 | 0.06 |
| 4-aminobenzoic acid | cyclohexane | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 4-aminobenzoic acid | diethyl ether | 0.13 | 0.14 | 0.16 | 0.13 | 0.07 | 0.06 | 0.10 | 0.08 |
| 4-aminobenzoic acid | DMF | 0.56 | 0.63 | 1.42 | 1.15 | 1.26 | 1.02 | 0.65 | 0.52 |
| 4-aminobenzoic acid | ethanol | 0.80 | 0.59 | 0.59 | 0.48 | 0.40 | 0.33 | 0.39 | 0.32 |
| 4-aminobenzoic acid | ethyl acetate | 0.58 | 0.17 | 0.23 | 0.19 | 0.09 | 0.07 | 0.14 | 0.11 |
| 4-aminobenzoic acid | ethylene glycol | 1.16 | 0.80 | 0.84 | 0.68 | 0.95 | 0.77 | 0.58 | 0.47 |
| 4-aminobenzoic acid | formamide | 0.26 | 0.22 | 0.39 | 0.32 | 0.47 | 0.38 | 0.22 | 0.18 |
| 4-aminobenzoic acid | heptane | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 4-aminobenzoic acid | methanol | 1.23 | 0.58 | 0.64 | 0.52 | 0.44 | 0.36 | 0.43 | 0.35 |
| 4-chloro-3-nitrobenzoic acid | 1-heptanol | 0.25 | 0.25 | 7.96 | 5.45 | 5.45 | 3.06 | 0.04 | 0.02 |
| 4-chloro-3-nitrobenzoic acid | 1-hexanol | 0.27 | 0.29 | 7.96 | 5.45 | 5.79 | 3.25 | 0.05 | 0.03 |
| 4-chloro-3-nitrobenzoic acid | 1-octanol | 0.23 | 0.24 | 7.96 | 4.78 | 4.77 | 2.67 | 0.04 | 0.02 |
| 4-chloro-3-nitrobenzoic acid | 1-pentanol | 0.27 | 0.29 | 7.96 | 5.34 | 6.05 | 3.39 | 0.04 | 0.02 |
| 4-chloro-3-nitrobenzoic acid | 1-propanol | 0.37 | 0.38 | 7.96 | 5.91 | 7.82 | 4.39 | 0.07 | 0.04 |
| 4-chloro-3-nitrobenzoic acid | 2-butanol | 0.32 | 0.33 | 7.13 | 4.00 | 4.74 | 2.66 | 0.07 | 0.04 |
| 4-chloro-3-nitrobenzoic acid | 2-methyl-1-propanol | 0.19 | 0.26 | 6.78 | 3.80 | 4.48 | 2.51 | 0.05 | 0.03 |
| 4-chloro-3-nitrobenzoic acid | 2-methyl-2-propanol | 0.38 | 0.28 | 7.64 | 4.29 | 5.55 | 3.11 | 0.05 | 0.03 |
| 4-chloro-3-nitrobenzoic acid | 2-pentanol | 0.30 | 0.27 | 7.96 | 4.54 | 5.17 | 2.90 | 0.04 | 0.02 |
| 4-chloro-3-nitrobenzoic acid | 2-propanol | 0.39 | 0.35 | 7.96 | 5.00 | 7.38 | 4.14 | 0.07 | 0.04 |
| 4-chloro-3-nitrobenzoic acid | 3-methyl-1-butanol | 0.24 | 0.23 | 6.86 | 3.85 | 4.08 | 2.29 | 0.04 | 0.02 |
| 4-chloro-3-nitrobenzoic acid | butyl acetate | 0.23 | 0.31 | 7.96 | 7.96 | 4.89 | 2.74 | 0.05 | 0.03 |
| 4-chloro-3-nitrobenzoic acid | dibutyl ether | 0.05 | 0.03 | 3.20 | 1.80 | 0.65 | 0.36 | 0.00 | 0.00 |
| 4-chloro-3-nitrobenzoic acid | diethyl ether | 0.24 | 0.29 | 7.96 | 7.96 | 6.39 | 3.59 | 0.04 | 0.02 |
| 4-chloro-3-nitrobenzoic acid | ethanol | 0.55 | 0.50 | 7.96 | 6.10 | 7.44 | 4.17 | 0.10 | 0.06 |
| 4-chloro-3-nitrobenzoic acid | ethyl acetate | 0.36 | 0.42 | 7.96 | 7.96 | 6.83 | 3.83 | 0.08 | 0.04 |
| 4-chloro-3-nitrobenzoic acid | methyl acetate | 0.44 | 0.47 | 7.96 | 7.38 | 4.49 | 2.52 | 0.11 | 0.06 |
| 4-chloro-3-nitrobenzoic acid | THF | 1.95 | 1.49 | 7.96 | 7.96 | 7.96 | 7.96 | 0.31 | 0.17 |
| 4-fluorobenzoic acid | acetonitrile | 0.39 | 0.43 | 0.38 | 1.12 | 0.10 | 0.30 | 0.07 | 0.19 |
| 4-fluorobenzoic acid | DMSO | 4.19 | 4.38 | 3.72 | 9.43 | 6.60 | 9.43 | 2.14 | 6.30 |
| 4-fluorobenzoic acid | ethanol | 0.80 | 0.89 | 0.94 | 2.78 | 0.64 | 1.90 | 0.18 | 0.54 |
| 4-fluorobenzoic acid | methanol | 0.76 | 0.83 | 0.86 | 2.55 | 0.60 | 1.76 | 0.21 | 0.61 |
| 4-fluorobenzoic acid | THF | 2.49 | 1.94 | 1.82 | 5.38 | 0.95 | 2.82 | 0.28 | 0.83 |
| 4-fluorobenzoic acid | toluene | 0.07 | 0.14 | 0.14 | 0.41 | 0.01 | 0.03 | 0.01 | 0.02 |
| 4-hydroxybenzoic acid | 1-hexanol | 0.95 | 1.15 | 0.18 | 0.39 | 0.11 | 0.23 | 0.88 | 1.91 |
| 4-hydroxybenzoic acid | 1-octanol | 0.68 | 0.73 | 0.12 | 0.25 | 0.06 | 0.14 | 0.65 | 1.41 |
| 4-hydroxybenzoic acid | 1-pentanol | 0.84 | 1.58 | 0.24 | 0.51 | 0.15 | 0.33 | 1.01 | 2.20 |
| 4-hydroxybenzoic acid | 1-propanol | 1.40 | 2.19 | 0.38 | 0.82 | 0.28 | 0.61 | 1.44 | 3.13 |
| 4-hydroxybenzoic acid | 2-methyl-1-propanol | 1.40 | 1.30 | 0.23 | 0.51 | 0.15 | 0.34 | 0.96 | 2.09 |
| 4-hydroxybenzoic acid | 2-propanol | 1.76 | 2.26 | 0.42 | 0.91 | 0.35 | 0.76 | 1.44 | 3.14 |
| 4-hydroxybenzoic acid | acetone | 1.54 | 0.48 | 0.11 | 0.25 | 0.05 | 0.11 | 1.13 | 2.46 |
| 4-hydroxybenzoic acid | benzene | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.03 | 0.07 |
| 4-hydroxybenzoic acid | butyl acetate | 0.45 | 0.20 | 0.04 | 0.08 | 0.01 | 0.03 | 0.44 | 0.96 |
| 4-hydroxybenzoic acid | chloroform | 0.00 | 0.01 | 0.00 | 0.01 | 0.00 | 0.00 | 0.06 | 0.13 |
| 4-hydroxybenzoic acid | diethyl ether | 0.52 | 0.40 | 0.06 | 0.12 | 0.02 | 0.05 | 0.56 | 1.22 |
| 4-hydroxybenzoic acid | DMF | 2.48 | 2.43 | 0.82 | 1.78 | 0.73 | 1.58 | 4.66 | 9.96 |
| 4-hydroxybenzoic acid | DMSO | 4.61 | 9.96 | 9.35 | 9.96 | 9.96 | 9.96 | 9.96 | 9.96 |
| 4-hydroxybenzoic acid | ethanol | 1.98 | 2.19 | 0.44 | 0.96 | 0.30 | 0.66 | 1.71 | 3.73 |
| 4-hydroxybenzoic acid | ethyl acetate | 0.74 | 0.33 | 0.06 | 0.13 | 0.02 | 0.05 | 0.67 | 1.45 |
| 4-hydroxybenzoic acid | ethylene glycol | 1.86 | 2.92 | 1.06 | 2.30 | 1.20 | 2.62 | 1.78 | 3.88 |
| 4-hydroxybenzoic acid | hexane | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 4-hydroxybenzoic acid | methanol | 2.46 | 1.66 | 0.43 | 0.94 | 0.30 | 0.65 | 1.62 | 3.53 |
| 4-hydroxybenzoic acid | toluene | 0.01 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.02 | 0.05 |
| 4-methoxybenzoic acid | 1-heptanol | 0.09 | 0.09 | 0.10 | 0.46 | 0.06 | 0.26 | 0.06 | 0.26 |
| 4-methoxybenzoic acid | 1-hexanol | 0.10 | 0.10 | 0.12 | 0.51 | 0.07 | 0.30 | 0.07 | 0.30 |
| 4-methoxybenzoic acid | 1-octanol | 0.08 | 0.08 | 0.09 | 0.37 | 0.05 | 0.21 | 0.05 | 0.23 |
| 4-methoxybenzoic acid | 1-pentanol | 0.10 | 0.10 | 0.12 | 0.54 | 0.08 | 0.34 | 0.07 | 0.31 |
| 4-methoxybenzoic acid | 1-propanol | 0.13 | 0.14 | 0.14 | 0.62 | 0.10 | 0.46 | 0.09 | 0.38 |
| 4-methoxybenzoic acid | 2-butanol | 0.13 | 0.14 | 0.14 | 0.63 | 0.10 | 0.42 | 0.09 | 0.41 |
| 4-methoxybenzoic acid | 2-methyl-1-propanol | 0.08 | 0.11 | 0.12 | 0.54 | 0.08 | 0.36 | 0.08 | 0.34 |
| 4-methoxybenzoic acid | 2-methyl-2-propanol | 0.17 | 0.14 | 0.12 | 0.54 | 0.09 | 0.39 | 0.08 | 0.35 |
| 4-methoxybenzoic acid | 2-propanol | 0.14 | 0.14 | 0.13 | 0.58 | 0.11 | 0.48 | 0.08 | 0.37 |
| 4-methoxybenzoic acid | 3-methyl-1-butanol | 0.08 | 0.10 | 0.12 | 0.53 | 0.07 | 0.32 | 0.07 | 0.31 |
| 4-methoxybenzoic acid | butyl acetate | 0.08 | 0.12 | 0.10 | 0.42 | 0.03 | 0.14 | 0.07 | 0.31 |
| 4-methoxybenzoic acid | dibutyl ether | 0.02 | 0.02 | 0.02 | 0.08 | 0.00 | 0.02 | 0.01 | 0.05 |
| 4-methoxybenzoic acid | diethyl ether | 0.09 | 0.10 | 0.09 | 0.41 | 0.04 | 0.17 | 0.06 | 0.27 |
| 4-methoxybenzoic acid | ethanol | 0.20 | 0.19 | 0.20 | 0.87 | 0.14 | 0.59 | 0.13 | 0.56 |
| 4-methoxybenzoic acid | ethyl acetate | 0.13 | 0.16 | 0.12 | 0.53 | 0.05 | 0.20 | 0.09 | 0.39 |
| 4-methoxybenzoic acid | THF | 0.70 | 0.47 | 0.30 | 1.32 | 0.16 | 0.69 | 0.23 | 0.99 |
| 4-nitrobenzoic acid | 1-heptanol | 0.08 | 0.07 | 1.84 | 8.79 | 1.03 | 7.81 | 0.01 | 0.05 |
| 4-nitrobenzoic acid | 1-hexanol | 0.08 | 0.09 | 1.87 | 8.79 | 1.12 | 8.46 | 0.01 | 0.06 |
| 4-nitrobenzoic acid | 1-octanol | 0.06 | 0.07 | 1.65 | 8.79 | 0.93 | 7.02 | 0.01 | 0.05 |
| 4-nitrobenzoic acid | 1-pentanol | 0.09 | 0.09 | 1.82 | 8.79 | 1.15 | 8.76 | 0.01 | 0.06 |
| 4-nitrobenzoic acid | 1-propanol | 0.11 | 0.12 | 2.02 | 8.79 | 1.50 | 8.79 | 0.01 | 0.10 |
| 4-nitrobenzoic acid | 2-butanol | 0.10 | 0.11 | 1.48 | 8.79 | 0.97 | 7.48 | 0.01 | 0.10 |
| 4-nitrobenzoic acid | 2-methyl-1-propanol | 0.07 | 0.08 | 1.39 | 8.79 | 0.92 | 6.99 | 0.01 | 0.07 |
| 4-nitrobenzoic acid | 2-methyl-2-propanol | 0.15 | 0.10 | 1.56 | 8.79 | 1.14 | 8.61 | 0.01 | 0.09 |
| 4-nitrobenzoic acid | 2-pentanol | 0.08 | 0.08 | 1.59 | 8.79 | 1.01 | 7.69 | 0.01 | 0.06 |
| 4-nitrobenzoic acid | 2-propanol | 0.11 | 0.12 | 1.71 | 8.79 | 1.41 | 8.79 | 0.01 | 0.10 |
| 4-nitrobenzoic acid | 3-methyl-1-butanol | 0.07 | 0.07 | 1.40 | 8.79 | 0.83 | 6.30 | 0.01 | 0.05 |
| 4-nitrobenzoic acid | butyl acetate | 0.08 | 0.11 | 3.12 | 8.79 | 1.07 | 8.08 | 0.01 | 0.07 |
| 4-nitrobenzoic acid | dibutyl ether | 0.02 | 0.01 | 0.83 | 6.27 | 0.17 | 1.27 | 0.00 | 0.00 |
| 4-nitrobenzoic acid | diethyl ether | 0.10 | 0.10 | 3.20 | 8.79 | 1.34 | 8.79 | 0.01 | 0.06 |
| 4-nitrobenzoic acid | ethanol | 0.14 | 0.15 | 2.16 | 8.79 | 1.48 | 8.79 | 0.02 | 0.15 |
| 4-nitrobenzoic acid | ethyl acetate | 0.13 | 0.16 | 3.80 | 8.79 | 1.47 | 8.79 | 0.02 | 0.11 |
| 4-nitrobenzoic acid | methanol | 0.18 | 0.15 | 1.50 | 8.79 | 1.03 | 7.84 | 0.03 | 0.19 |
| 4-nitrobenzoic acid | methyl acetate | 0.14 | 0.18 | 3.09 | 8.79 | 1.05 | 7.99 | 0.02 | 0.17 |
| 4-nitrobenzoic acid | THF | 0.78 | 0.52 | 8.79 | 8.79 | 5.37 | 8.79 | 0.05 | 0.41 |
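
For orientation, the way a prediction depends on the descriptors and on the aqueous solubility can be sketched as follows. This assumes the commonly written Abraham form log10(S_solvent / S_water) = c + e*E + s*S + a*A + b*B + v*V; the page does not spell out the web service's exact equation. The solvent coefficients and aqueous solubility values in the sketch are made-up placeholders, not published values, and the function names are hypothetical.

```python
def log_solubility_ratio(coeffs, descriptors):
    """log10(S_solvent / S_water) under the assumed Abraham-type equation."""
    return (
        coeffs["c"]
        + coeffs["e"] * descriptors["E"]
        + coeffs["s"] * descriptors["S"]
        + coeffs["a"] * descriptors["A"]
        + coeffs["b"] * descriptors["B"]
        + coeffs["v"] * descriptors["V"]
    )


def predict_solubility(aqueous_solubility, coeffs, descriptors):
    """Scale the aqueous solubility by the predicted solvent/water ratio."""
    return aqueous_solubility * 10 ** log_solubility_ratio(coeffs, descriptors)


# Measured descriptors of benzoic acid, from the first table on this page.
benzoic_acid = {"E": 0.921, "S": 0.849, "A": 0.314, "B": 0.422, "V": 0.932}

# PLACEHOLDER solvent coefficients, for illustration only (not real values).
placeholder_solvent = {"c": 0.2, "e": 0.5, "s": -1.0, "a": 0.3, "b": -3.6, "v": 3.9}

# PLACEHOLDER aqueous solubilities standing in for a measured and a predicted value.
with_measured_sw = predict_solubility(0.02, placeholder_solvent, benzoic_acid)
with_predicted_sw = predict_solubility(0.03, placeholder_solvent, benzoic_acid)
print(with_predicted_sw / with_measured_sw)  # close to 1.5, the ratio of the two inputs
```

Under this form, swapping the aqueous solubility value rescales every prediction for a solute by the same factor. That matches the roughly constant ratio between the MPFM-VCC and MPFM columns for 2-methoxybenzoic acid in the table above (0.77/1.16 and 0.59/0.89 are both about 0.66), except where predictions sit at a repeated maximum value such as 7.96.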

Results

It was our premise that the MPFM would be better at predicting solubilities than both ab initio methods (Platts, model001), though not as good as the optimal method of using measuredAD descriptors. However, for benzoic acid derivatives, using the measuredAD descriptors of benzoic acid as the core values in the MPFM, this turned out not to be the case, as can be seen from the summary of average absolute error (AAE) and root mean square error (RMSE) in the table below.

| measure | measuredAD | MPFM | MPFM-VCC | Platts | Platts-VCC | model001 | model001-VCC |
|---|---|---|---|---|---|---|---|
| AAE | 0.167 | 2.348 | 3.211 | 1.876 | 2.327 | 0.494 | 0.650 |
| RMSE | 0.490 | 3.694 | 4.531 | 3.203 | 3.579 | 0.698 | 0.954 |

There are several reasons for this. Firstly, the ab initio Platts method does well in predicting the values of S and B for benzoic acid, and thus the MPFM values for S and B are not significantly different from the ab initio Platts values; in fact they are slightly worse. The power of the MPFM can be seen with the A descriptor, where the ab initio Platts value is not very good but the predicted changes to A are in general good, leading to good MPFM predictions for A. Secondly, the error that the Platts method makes in predicting the value of B for the nitro-substituted benzoic acid derivatives makes a big difference to the predicted solubilities for those compounds, and this skews the overall results, as can be seen in the table below, where we have removed those data values. Of course, this can only be done in hindsight.

| measure | measuredAD | MPFM | MPFM-VCC | Platts | Platts-VCC | model001 | model001-VCC |
|---|---|---|---|---|---|---|---|
| AAE | 0.224 | 0.615 | 1.202 | 0.415 | 0.727 | 0.620 | 0.977 |
| RMSE | 0.604 | 0.970 | 2.091 | 0.793 | 1.312 | 0.835 | 1.089 |

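
The effect of dropping the nitro-substituted compounds can be illustrated on a small subset of the data. The sketch below uses only the five ethanol rows listed in the code (measured value and MPFM prediction), so its numbers will not match the full-table summaries; the variable and function names are hypothetical.

```python
# (solute, measured solubility, MPFM prediction) for ethanol, from the solubility table.
ethanol_rows = [
    ("2-chloro-5-nitrobenzoic acid", 1.40, 7.96),
    ("2-methoxybenzoic acid", 1.19, 1.98),
    ("4-aminobenzoic acid", 0.80, 0.59),
    ("4-methoxybenzoic acid", 0.20, 0.20),
    ("4-nitrobenzoic acid", 0.14, 2.16),
]


def subset_aae(rows):
    """Average absolute error between prediction and measurement."""
    return sum(abs(predicted - measured) for _, measured, predicted in rows) / len(rows)


# Drop every solute whose name contains "nitro".
without_nitro = [row for row in ethanol_rows if "nitro" not in row[0]]

aae_all = subset_aae(ethanol_rows)
aae_without_nitro = subset_aae(without_nitro)
print(round(aae_all, 3), round(aae_without_nitro, 3))
```

Even on this small subset, the two nitro-substituted solutes account for most of the MPFM error.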
Finally, we note that, as expected, using measured aqueous solubility values instead of predicted ones gives better solubility predictions.

Data Files

combineddataforwiki.xlsx (combined solubility predictions spreadsheet), combineddataforwikiwithoutnitros.xlsx (solubilities without nitro-substituted compounds), combineddescriptors.xlsx (spreadsheet of S, A, and B descriptor values), combineddescriptorsforTP.xlsx (descriptor values formatted for Tableau Public, used to create the graphs).

References

[Platts99] Platts JA, Butina D, Abraham MH, Hersey A. Estimation of Molecular Linear Free Energy Relation Descriptors Using a Group Contribution Approach. J. Chem. Inf. Comput. Sci. 1999, 39, 835-845.
[Abraham09] Abraham MH, et al. Prediction of Solubility of Drugs and Other Compounds in Organic Solvents. Journal of Pharmaceutical Sciences. 2009. DOI: 10.1002/jps.21922