Acta Physico-Chimica Sinica, 2012, Vol. 28, Issue (10): 2249-2257. doi: 10.3866/PKU.WHXB201209171

Theoretical and Computational Chemistry

An Empirical Additive Model for Aqueous Solubility Computation: Success and Limitations

DUAN Bao-Gen, LI Yan, LI Jie, CHENG Tie-Jun, WANG Ren-Xiao

  1. State Key Laboratory of Bioorganic Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, P. R. China
  • Received: 2012-07-16 Revised: 2012-09-12 Published: 2012-09-26
  • Corresponding author: WANG Ren-Xiao E-mail: wangrx@mail.sioc.ac.cn
  • Supported by:

    The project was supported by the National Natural Science Foundation of China (81172984, 21072213, 21002117, 21102168, 21102165) and the National High-Tech Research and Development Program of China (863) (2012AA020308).


Abstract:

We have developed XLOGS, a new empirical model for computing the aqueous solubility (logS) of small organic compounds. It is essentially an additive model that uses 83 atom/group types and three correction factors as descriptors. It can also compute the logS value of a query compound by using the experimentally known logS value of an appropriate reference molecule as a starting point. XLOGS was calibrated on a training set of 4171 compounds with known logS values; multiple linear regression gave a squared correlation coefficient (R²) of 0.82 and a standard deviation (SD) of 0.96 log units. The training set was further split into two subsets, one containing only liquid compounds and the other only solid compounds. Regression results were markedly better on the liquid subset than on the solid subset (SD = 0.65 vs 0.94 log units). The difference between log(1/S) and logP (the logarithm of the n-octanol/water partition coefficient, i.e., of the ratio of a compound's equilibrium concentrations in n-octanol and water) was used as an indicator to investigate the performance of XLOGS on liquid and solid compounds. Our results suggest that an additive model like XLOGS performs best when this difference is close to zero. XLOGS was also compared with three other popular logS models (Qikprop, MOE-logS, and ALOGPS) on an independent test set of 132 drug-like compounds. Taken together, our study provides some general guidance for applying additive models to the computation of aqueous solubility.

Key words: Aqueous solubility, Additive model, XLOGS
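
As an illustration of the general idea described in the abstract, the following is a minimal sketch of an additive model fitted by multiple linear regression, together with a reference-based prediction. It is not the XLOGS implementation: the three descriptor columns, the coefficient values, the function names, and the reference-shift formula are all assumptions made for illustration, and the data are synthetic toy numbers, not values from the paper. In a real model the correction factors would simply be additional descriptor columns.

```python
import numpy as np

def fit_additive_model(counts, logs):
    """Least-squares fit of logS ~ intercept + sum_i counts[:, i] * w_i."""
    # Prepend a column of ones so the intercept is fitted with the weights.
    X = np.hstack([np.ones((counts.shape[0], 1)), counts])
    coef, *_ = np.linalg.lstsq(X, logs, rcond=None)
    return coef[0], coef[1:]

def predict(counts, intercept, weights):
    """Plain additive prediction from descriptor counts."""
    return intercept + counts @ weights

def predict_from_reference(query_counts, ref_counts, ref_logs_exp, weights):
    """Assumed reference-based form: start from the experimental logS of a
    reference molecule and add the contributions of the descriptor
    differences between query and reference."""
    return ref_logs_exp + (query_counts - ref_counts) @ weights

# Synthetic toy data (hypothetical): three descriptor types, six molecules.
true_w = np.array([-0.5, 0.8, -1.2])   # made-up contributions
true_b = 0.3                           # made-up intercept
counts = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                   [2, 1, 0], [1, 1, 1], [3, 0, 2]], dtype=float)
logs = true_b + counts @ true_w        # noise-free by construction

intercept, weights = fit_additive_model(counts, logs)

# Reference-based prediction: the reference's "experimental" value is
# pretended to deviate from the additive model by +0.4 log units.
ref_counts = counts[3]
ref_logs_exp = predict(ref_counts, intercept, weights) + 0.4
query_counts = np.array([2.0, 1.0, 1.0])  # reference plus one type-3 group
query_logs = predict_from_reference(query_counts, ref_counts,
                                    ref_logs_exp, weights)
```

In this sketch the reference-based prediction carries over the reference molecule's deviation from the additive model, which is the intuition behind starting from a structurally similar compound with a known logS value.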