Acta Phys. -Chim. Sin., 2012, Vol. 28, Issue 10: 2249-2257. doi: 10.3866/PKU.WHXB201209171


An Empirical Additive Model for Aqueous Solubility Computation: Success and Limitations

DUAN Bao-Gen, LI Yan, LI Jie, CHENG Tie-Jun, WANG Ren-Xiao   

  1. State Key Laboratory of Bioorganic Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, P. R. China
  • Received: 2012-07-16 Revised: 2012-09-12 Published: 2012-09-26
  • Supported by:

    The project was supported by the National Natural Science Foundation of China (81172984, 21072213, 21002117, 21102168, 21102165) and the National High-Tech Research and Development Program of China (863) (2012AA020308).


We have developed a new empirical model, XLOGS, for computing the aqueous solubility (logS) of organic compounds. It is essentially an additive model that employs a total of 83 atom/group types and three correction factors as descriptors. Rather than computing logS from scratch, it uses the known logS value of an appropriate reference molecule as the starting point for a query compound. XLOGS was calibrated on a training set of 4171 compounds with known logS values. In regression, the squared correlation coefficient (R²) was 0.82 and the standard deviation (SD) was 0.96 log units. The entire training set was further split into one subset containing only liquid compounds and another containing only solid compounds. Regression results of XLOGS were clearly better on the liquid subset (SD = 0.65 vs 0.94 log units). The difference between log(1/S) and logP (the logarithm of the partition coefficient, i.e., the ratio of the equilibrium concentrations of a compound in the n-octanol and water phases) was used as an indicator to investigate the performance of XLOGS on liquid and solid compounds. Our results suggest that an additive model like XLOGS performs most satisfactorily when this difference is close to zero. Three other logS models, QikProp, MOE-logS, and ALOGPS, were also compared with XLOGS on an independent test set of 132 drug-like compounds. Taken together, our study provides some general guidance for applying additive models to the computation of aqueous solubility.

Key words: Aqueous solubility, Additive model, XLOGS
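
The reference-based additive scheme described in the abstract can be illustrated with a minimal Python sketch. It is one plausible reading of the idea, not the published XLOGS implementation: the descriptor names, the contribution values in `CONTRIB`, the reference logS value, and the function names `additive_delta` and `predict_logs` are all hypothetical placeholders, and the three correction factors are assumed to be handled as additional counted descriptors. The predicted logS of the query is taken as the known logS of the reference plus the weighted difference in descriptor counts between the two molecules.

```python
# Hypothetical per-descriptor contributions in log units.
# These are illustrative numbers, NOT the fitted XLOGS parameters.
CONTRIB = {
    "C.sp3": -0.50,
    "C.ar": -0.35,
    "O.hydroxyl": 1.00,
    "N.amine": 0.80,
}


def additive_delta(query_counts, ref_counts, contrib=CONTRIB):
    """Sum contribution * (count in query - count in reference) over all descriptors."""
    descriptors = set(query_counts) | set(ref_counts)
    return sum(
        contrib[d] * (query_counts.get(d, 0) - ref_counts.get(d, 0))
        for d in descriptors
    )


def predict_logs(query_counts, ref_counts, ref_logs, contrib=CONTRIB):
    """Start from the reference molecule's known logS and add the additive difference."""
    return ref_logs + additive_delta(query_counts, ref_counts, contrib)


# Usage: a query that differs from the reference by one extra sp3 carbon.
# The descriptor counts and the reference logS are made up for illustration.
reference = {"C.ar": 6, "O.hydroxyl": 1}
query = {"C.ar": 6, "C.sp3": 1, "O.hydroxyl": 1}
print(f"{predict_logs(query, reference, ref_logs=-0.10):.2f}")
```

In this formulation, descriptors shared by the query and the reference cancel out, so only the structural difference between the two molecules contributes to the correction. A plain additive model without a reference would instead sum the contributions over all descriptors of the query.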