Predicting the solubility of boc-protected amino acids by modifying the Abraham descriptors of boc-glycine







boc-glycine
boc-serine
boc-phenylalanine
boc-alanine
boc-leucine

Researchers

Jean-Claude Bradley, Andrew Lang and Khalid Mirza

Objective

To predict the solubility of boc-protected amino acids in organic solvents by modifying the Abraham descriptors of boc-glycine.
boc-glycine (N-(tert-Butoxycarbonyl)glycine)
SMILES: O=C(OC(C)(C)C)NCC(=O)O
CSID: 70660
boc-serine (N-(tert-Butoxycarbonyl)-L-serine)
SMILES: O=C(OC(C)(C)C)N[C@H](C(=O)O)CO
CSID: 89204
boc-phenylalanine (N-(tert-butoxycarbonyl)-L-phenylalanine)
SMILES: O=C(OC(C)(C)C)N[C@H](C(=O)O)Cc1ccccc1
CSID: 70249
boc-alanine (N-[t-Butoxycarbonyl]-L-alanine)
SMILES: O=C(OC(C)(C)C)N[C@H](C(=O)O)C
CSID: 76745
boc-leucine (N-(tert-Butoxycarbonyl)-L-leucine)
SMILES: O=C(OC(C)(C)C)N[C@H](C(=O)O)CC(C)C
CSID: 75037

Background

The solubility of a solute in organic solvents can be predicted if its Abraham descriptors are known. If they are not known, the descriptors can be estimated from structure alone, either with a QSAR model such as model001 or by summing fragment contributions[Platts99]. A more accurate approach is to obtain the descriptors by regressing measured solubility values against Abraham's general solvation equations[Abraham09]. The Abraham descriptors of boc-glycine were obtained from measured solubility values by this linear regression method and can be used to predict the solubility of boc-glycine in over 70 solvents.
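
As a rough illustration of how the descriptors feed into a prediction, the sketch below evaluates an Abraham-type linear equation, commonly written as log(Ss/Sw) = c + eE + sS + aA + bB + vV, where Ss and Sw are the solubilities in the organic solvent and in water. The solute descriptors are the boc-glycine values from this page. The solvent coefficients are made-up placeholders, not published values for any real solvent, and the names Descriptors, log_solubility_ratio and PLACEHOLDER_SOLVENT are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Abraham solute descriptors E, S, A, B and V."""
    E: float
    S: float
    A: float
    B: float
    V: float


def log_solubility_ratio(solute, coeffs):
    """Evaluate c + e*E + s*S + a*A + b*B + v*V for one solvent.

    coeffs is the tuple (c, e, s, a, b, v) of solvent coefficients.
    """
    c, e, s, a, b, v = coeffs
    return (c + e * solute.E + s * solute.S + a * solute.A
            + b * solute.B + v * solute.V)


# boc-glycine descriptors as reported on this page (obtained by regression).
BOC_GLYCINE = Descriptors(E=0.402, S=0.825, A=0.256, B=0.789, V=1.343)

# PLACEHOLDER coefficients (c, e, s, a, b, v): illustrative only, NOT real data.
PLACEHOLDER_SOLVENT = (0.2, 0.3, -1.0, -3.0, -4.0, 4.0)

if __name__ == "__main__":
    value = log_solubility_ratio(BOC_GLYCINE, PLACEHOLDER_SOLVENT)
    print(f"log(Ss/Sw) with placeholder coefficients: {value:.3f}")
```

To make an actual prediction, the placeholder tuple would be replaced by the published coefficients for the solvent of interest, and the result combined with the aqueous solubility of the solute.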

At the time of writing, solubility data for other boc-protected amino acids in organic solvents are limited, so the Abraham descriptors of these compounds cannot be calculated by regression. In addition, boc-glycine is a known outlier for model001, so using model001 to predict the Abraham descriptors of other boc-protected amino acids is not desirable. However, since the Abraham descriptors of boc-glycine are known, the descriptors of structurally similar boc-protected amino acids can be calculated fragment-wise. That is, starting from the measured Abraham descriptors (MeasuredAD) of boc-glycine, we add and subtract the fragments that convert boc-glycine into each derivative and adjust the descriptors by the corresponding fragment values in Platts' paper[Platts99].

We have already used this technique to calculate the Abraham descriptors of boc-serine, comparing the predicted solubility of boc-serine in THF and methanol to the measured values. The method seems to work well and we will further test it by measuring the solubilities (and thus Abraham descriptors) of the boc-protected amino acids listed above in various organic solvents to compare with the predicted values given below.

Compound            E      S      A      B      V      Notes
boc-glycine         0.402  0.825  0.256  0.789  1.343  calculated by regression of known solubility values
boc-serine          0.552  1.108  0.000  1.107  1.522  calculated previously
boc-phenylalanine   0.899  1.161  0.256  0.866  2.071  see below
boc-alanine         0.387  0.786  0.256  0.807  1.463  see below
boc-leucine         0.372  0.747  0.256  0.825  1.886  see below

Calculations

boc-phenylalanine: To change boc-glycine into boc-phenylalanine fragment-wise we replace >CH2 (Platts Table2:2) with >CH- (Platts Table2:3) and then add back a >CH2 and six times =CH- (Platts Table2:6). This results in changes of E:+0.497, S:+0.336, A:+0.000, B:+0.077 and predicted Abraham descriptors of E:0.899, S:1.161, A:0.256, B:0.866, V:2.071. These Abraham descriptors can be used to predict the solubility of boc-phenylalanine in over 70 solvents.
boc-alanine: To change boc-glycine into boc-alanine fragment-wise we replace >CH2 (Platts Table2:2) with >CH- (Platts Table2:3) and then add a -CH3 (Platts Table2:1). This results in changes of E:-0.015, S:-0.039, A:+0.000, B:+0.018 and predicted Abraham descriptors of E:0.387, S:0.786, A:0.256, B:0.807, V:1.463. These Abraham descriptors can be used to predict the solubility of boc-alanine in over 70 solvents.
boc-leucine: To change boc-glycine into boc-leucine fragment-wise we replace >CH2 (Platts Table2:2) with >CH- (Platts Table2:3) and then add back a >CH2, a >CH- and two -CH3 (Platts Table2:1). This results in changes of E:-0.030, S:-0.078, A:+0.000, B:+0.036 and predicted Abraham descriptors of E:0.372, S:0.747, A:0.256, B:0.825, V:1.886. These Abraham descriptors can be used to predict the solubility of boc-leucine in over 70 solvents.
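
The arithmetic in the three calculations above can be checked with a short sketch. It adds the net descriptor changes quoted in the text to the boc-glycine values and does not look up individual Platts fragment values. The names NET_CHANGES and apply_changes are hypothetical, and V is left out because the text quotes no fragment-wise change for it.

```python
# Base descriptors of boc-glycine (from regression, as reported on this page).
BOC_GLYCINE = {"E": 0.402, "S": 0.825, "A": 0.256, "B": 0.789}

# Net changes quoted in the calculations above for each derivative.
NET_CHANGES = {
    "boc-phenylalanine": {"E": 0.497, "S": 0.336, "A": 0.000, "B": 0.077},
    "boc-alanine": {"E": -0.015, "S": -0.039, "A": 0.000, "B": 0.018},
    "boc-leucine": {"E": -0.030, "S": -0.078, "A": 0.000, "B": 0.036},
}


def apply_changes(base, changes):
    """Add per-descriptor changes to the base values, rounded to 3 decimals."""
    return {key: round(base[key] + changes[key], 3) for key in base}


PREDICTED = {
    name: apply_changes(BOC_GLYCINE, delta)
    for name, delta in NET_CHANGES.items()
}

if __name__ == "__main__":
    for name, desc in PREDICTED.items():
        print(name, desc)
```

The resulting E, S, A and B values can be compared directly with the corresponding rows of the descriptor table.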

Solubilities

Conclusion

References

[Platts99] Platts JA, Butina D, Abraham MH, Hersey A. 1999. Estimation of Molecular Linear Free Energy Relation Descriptors Using a Group Contribution Approach. Journal of Chemical Information and Computer Sciences 39: 835-845.
[Abraham09] Abraham MH, et al. 2009. Prediction of Solubility of Drugs and Other Compounds in Organic Solvents. Journal of Pharmaceutical Sciences. DOI: 10.1002/jps.21922