Acta Physico-Chimica Sinica, 2015, Vol. 31, Issue 9: 1795-1802. doi: 10.3866/PKU.WHXB201507301

Biophysical Chemistry

Predicting and Virtually Screening Breast Cancer Targeting Protein HEC1 Inhibitors by Molecular Descriptors and Machine Learning Methods

Bing HE1,2, Yong LUO1, Bing-Ke LI2, Ying XUE1,3, Luo-Ting YU1, Xiao-Long QIU4,5, Teng-Kuei YANG4

  1 State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, and Collaborative Innovation Center for Biotherapy, Chengdu 610041, P. R. China
  2 College of Chemistry and Life Science, Chengdu Normal University, Chengdu 611130, P. R. China
  3 College of Chemistry, Sichuan University, Chengdu 610064, P. R. China
  4 Zhaobang Bio-Med. Institute Co., Ltd., Nantong 226000, Jiangsu Province, P. R. China
  5 Wisdom Pharmaceutical Co., Ltd., Haimen 226123, Jiangsu Province, P. R. China
  • Received: 2015-04-02; Published: 2015-09-06

Abstract:

Highly expressed in cancer 1 (HEC1) is a conserved mitotic regulator that is critical for spindle checkpoint control, kinetochore function, and cell survival. HEC1 overexpression has been detected in a variety of human cancers and is linked to poor prognosis in primary breast cancer, so screening for novel inhibitors with high affinity for HEC1 is important for exploring targeted breast cancer therapy. Machine learning (ML) methods have been used to search for compounds exhibiting good pharmacodynamics and toxicity profiles. In this work, two ML methods, support vector machines (SVMs) and random forests (RFs), were used to build classification models that separate HEC1 inhibitors from non-inhibitors in a structurally diverse compound library, following feature selection on the molecular descriptors. Both methods achieved promising prediction accuracy, with the RF model performing better. The RF model was then used for virtual screening of an in-house compound database, which yielded two novel potential HEC1 inhibitors. In vitro assays showed that both compounds had a certain degree of antitumor activity against the MDA-MB-468 and MDA-MB-231 breast cancer cell lines. These results suggest that ML methods are a promising approach to designing and virtually screening HEC1 inhibitors.

Key words: HEC1, Selective inhibitor, Machine learning method, Support vector machine, Random forest, Virtual screening
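
The workflow summarized in the abstract (descriptor selection, an inhibitor/non-inhibitor classifier, then ranking an unlabeled library) can be illustrated with a minimal sketch. This is not the authors' implementation. The descriptor values are synthetic, every name below is hypothetical, and the model is a bagged ensemble of depth-1 trees, a heavily simplified stand-in for a random forest. The SVM arm and real descriptor calculation are omitted.

```python
import random
from statistics import mean, pstdev

random.seed(0)
N_DESC, INFORMATIVE = 8, (0, 1)  # assumption: 8 synthetic descriptors, 2 informative

def make_compound(label):
    # Synthetic descriptor vector; "inhibitors" (label 1) are shifted on informative descriptors.
    return [random.gauss(3.0 if (label and j in INFORMATIVE) else 0.0, 1.0)
            for j in range(N_DESC)]

def make_set(n_per_class):
    data = [(make_compound(y), y) for y in (0, 1) for _ in range(n_per_class)]
    random.shuffle(data)
    return data

def select_descriptors(data, k):
    """Keep the k descriptors with the largest class-mean separation."""
    def score(j):
        pos = [x[j] for x, y in data if y == 1]
        neg = [x[j] for x, y in data if y == 0]
        return abs(mean(pos) - mean(neg)) / (pstdev(pos) + pstdev(neg) + 1e-9)
    return sorted(range(N_DESC), key=score, reverse=True)[:k]

def fit_stump(sample, feature):
    """Depth-1 tree: pick the threshold and polarity with the most correct calls."""
    best = (-1, 0.0, 1)
    for x, _ in sample:
        t = x[feature]
        for sign in (1, -1):
            hits = sum((1 if sign * (xi[feature] - t) > 0 else 0) == yi
                       for xi, yi in sample)
            if hits > best[0]:
                best = (hits, t, sign)
    return feature, best[1], best[2]

def fit_forest(train, features, n_trees=25):
    """Bagging: each stump sees a bootstrap sample and one randomly chosen descriptor."""
    forest = []
    for _ in range(n_trees):
        boot = [random.choice(train) for _ in train]
        forest.append(fit_stump(boot, random.choice(features)))
    return forest

def inhibitor_score(forest, x):
    """Fraction of stumps voting 'inhibitor' for descriptor vector x."""
    return sum(1 for f, t, s in forest if s * (x[f] - t) > 0) / len(forest)

train, test = make_set(60), make_set(40)
selected = select_descriptors(train, k=2)
forest = fit_forest(train, selected)
accuracy = sum((inhibitor_score(forest, x) > 0.5) == bool(y) for x, y in test) / len(test)

# Virtual screening: rank an unlabeled library and keep the top-scoring compounds.
library = [x for x, _ in make_set(10)]
hits = sorted(range(len(library)),
              key=lambda i: inhibitor_score(forest, library[i]), reverse=True)[:2]
print(f"selected descriptors: {sorted(selected)}, held-out accuracy: {accuracy:.2f}")
print("top-ranked library indices:", hits)
```

In practice the descriptor vectors would come from a cheminformatics toolkit and the classifiers from an established ML library. The sketch only shows how the three stages connect.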