Acta Physico-Chimica Sinica, 2009, Vol. 25, Issue (08): 1581-1586. doi: 10.3866/PKU.WHXB20090756

Research Article

Classification Models for HERG Potassium Channel Inhibitors Based on the Support Vector Machine Approach

LI Ping, TAN Ning-Xin, RAO Han-Bing, LI Ze-Rong, CHEN Yu-Zong

  1. College of Chemistry, Sichuan University, Chengdu 610065, P. R. China; College of Chemical Engineering, Sichuan University, Chengdu 610065, P. R. China; Department of Pharmacy, National University of Singapore, Singapore 117543
  • Received: 2009-02-20; Revised: 2009-04-20; Published: 2009-07-16
  • Corresponding author: LI Ze-Rong, E-mail: lizrscu@yahoo.com.cn

Abstract:

We calculated 1559 molecular descriptors, covering constitutional, charge distribution, topological, geometrical, and physicochemical properties, to characterize the molecular structures of human ether-a-go-go related gene (HERG) potassium channel inhibitors. A hybrid filter/wrapper approach combining Fisher score (F-score) ranking with Monte Carlo simulated annealing was used to select the descriptors relevant to the classification of HERG potassium channel inhibitors. Three classification models were built with the support vector machine (SVM) approach, using IC50 thresholds of 1.0 and 10.0 μmol·L⁻¹ as the classification criteria. For the 367 training set molecules, 5-fold cross-validation (CV) gave average prediction accuracies of 84.8%-96.6% for positive samples, 80.7%-97.7% for negative samples, and 87.1%-97.2% overall, better than results previously reported in the literature. On an external test set of 97 molecules, the overall prediction accuracies of the three models ranged from 67.0% to 90.1%, close to or better than results reported in the literature.

Key words: Support vector machine, HERG potassium channel inhibitor, Monte Carlo simulated annealing
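
The hybrid filter/wrapper descriptor selection named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the commonly used two-class F-score definition (between-class separation of the means divided by the sum of the within-class sample variances), which may differ from the one used in the paper. The names f_scores, top_k, anneal_subset, and toy_score are hypothetical, the cooling schedule is arbitrary, and the toy data are invented for illustration. In practice the score_fn callback would return something like the cross-validated SVM accuracy for a descriptor subset.

```python
import math
import random
from statistics import mean, variance


def f_scores(X, y):
    """Return one F-score per descriptor column.

    X: list of rows (one per molecule); y: labels, 1 = positive, 0 = negative.
    Each class needs at least two molecules for the sample variance.
    """
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == 0]
    scores = []
    for j in range(len(X[0])):
        col_all = [row[j] for row in X]
        col_pos = [row[j] for row in pos]
        col_neg = [row[j] for row in neg]
        overall = mean(col_all)
        between = (mean(col_pos) - overall) ** 2 + (mean(col_neg) - overall) ** 2
        within = variance(col_pos) + variance(col_neg)  # sample variances
        scores.append(between / within if within > 0 else 0.0)
    return scores


def top_k(scores, k):
    """Filter step: indices of the k highest-scoring descriptors."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def anneal_subset(candidates, score_fn, steps=200, t0=1.0, cooling=0.98, seed=0):
    """Wrapper step: Monte Carlo simulated annealing over descriptor subsets.

    score_fn(subset) is a hypothetical callback; higher is better.
    """
    rng = random.Random(seed)
    current = set(candidates)
    current_score = score_fn(current)
    best, best_score = set(current), current_score
    temp = t0
    for _ in range(steps):
        trial = set(current)
        # Propose a move: add or drop one randomly chosen descriptor.
        trial.symmetric_difference_update({rng.choice(candidates)})
        if trial:  # never evaluate an empty subset
            trial_score = score_fn(trial)
            delta = trial_score - current_score
            # Metropolis criterion: always accept improvements, sometimes accept worse.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, current_score = trial, trial_score
                if current_score > best_score:
                    best, best_score = set(current), current_score
        temp *= cooling  # geometric cooling (an assumption)
    return sorted(best), best_score


# Toy data, not from the paper: descriptor 0 separates the classes,
# descriptor 1 is uninformative, descriptor 2 is constant.
X = [[1.0, 5.0, 2.0], [1.2, 3.0, 2.0], [0.9, 4.0, 2.0],
     [3.0, 4.5, 2.0], [3.1, 3.5, 2.0], [2.9, 4.0, 2.0]]
y = [1, 1, 1, 0, 0, 0]

scores = f_scores(X, y)
candidates = top_k(scores, 3)


def toy_score(subset):
    """Stand-in for cross-validated accuracy: reward descriptor 0, penalize size."""
    return (1.0 if 0 in subset else 0.0) - 0.1 * len(subset)


best, best_score = anneal_subset(candidates, toy_score)
print(scores, best, best_score)
```

The filter step is cheap and prunes the descriptor pool before the more expensive wrapper search, which is the usual motivation for combining the two.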