Acta Phys. -Chim. Sin., 2010, Vol. 26, Issue (12): 3351-3359. doi: 10.3866/PKU.WHXB20101128

• BIOPHYSICAL CHEMISTRY •

Classification Models for Acetylcholinesterase Inhibitors Based on Machine Learning Methods

YANG Guo-Bing2, LI Ze-Rong1, RAO Han-Bing1, LI Xiang-Yuan2, CHEN Yu-Zong3   

    1. College of Chemistry, Sichuan University, Chengdu 610064, P. R. China;
    2. College of Chemical Engineering, Sichuan University, Chengdu 610065, P. R. China;
    3. Department of Pharmacy, National University of Singapore, Singapore 117543
  • Received: 2010-07-22  Revised: 2010-08-15  Published: 2010-12-01
  • Contact: LI Ze-Rong  E-mail: lizerong@scu.edu.cn
  • Supported by:

    The project was supported by the National Natural Science Foundation of China (20973118).

Abstract:

A total of 1559 molecular descriptors, including constitutional, charge distribution, topological, geometrical, and physicochemical descriptors, were calculated to encode acetylcholinesterase inhibitors. A subset of 37 molecular descriptors was selected using a hybrid filter/wrapper approach that combines the Fisher score with Monte Carlo simulated annealing. Classification models for the acetylcholinesterase inhibitors were then built using support vector machine (SVM), artificial neural network (ANN), and k-nearest neighbor (k-NN) methods. For the 515 samples in the training set, 5-fold cross validation gave average prediction accuracies of 87.3%-92.7%, 67.0%-81.0%, and 79.4%-88.2% for the positive, the negative, and the total samples, respectively. With the y-scrambling method, the average prediction accuracies were 72.7%-82.5%, 41.0%-53.0%, and 62.1%-69.1% for the positive, the negative, and the total samples, respectively, which are lower than the cross-validation results and indicate that there was no chance correlation in our models. An external test on 172 samples that were not used for model building gave prediction accuracies of 93.3%-100.0%, 74.6%-89.6%, and 86.1%-95.9% for the positive, the negative, and the total samples, respectively. The prediction accuracies obtained by all the machine learning methods, especially the SVM method, were far better than previously reported results.
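The filter step and the three reported accuracies can be illustrated with a minimal Python sketch. It assumes one common two-class form of the Fisher score (squared gap between class means divided by the sum of class variances) and assumes that the positive, negative, and total accuracies correspond to sensitivity, specificity, and overall accuracy. The abstract does not give the exact formulas, and the function names and toy data below are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score of a single descriptor.

    Assumed form: (mean_pos - mean_neg)^2 / (var_pos + var_neg).
    Labels are 1 for inhibitors (positive) and 0 for non-inhibitors.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Both classes are constant: perfectly separating or useless.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def class_accuracies(y_true, y_pred):
    """Return (positive, negative, total) prediction accuracies."""
    pairs = list(zip(y_true, y_pred))
    pos_hits = sum(1 for t, p in pairs if t == 1 and p == 1)
    neg_hits = sum(1 for t, p in pairs if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return pos_hits / n_pos, neg_hits / n_neg, (pos_hits + neg_hits) / len(pairs)


# Hypothetical toy data: one descriptor measured on six compounds.
descriptor = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
activity = [0, 0, 0, 1, 1, 1]
score = fisher_score(descriptor, activity)

# Hypothetical predictions from some classifier on six test compounds.
acc_pos, acc_neg, acc_total = class_accuracies([1, 1, 1, 1, 0, 0],
                                               [1, 1, 1, 0, 0, 1])
```

In a filter/wrapper scheme of this kind, descriptors would typically be ranked by such a score first, and a search procedure (here, Monte Carlo simulated annealing) would then pick the final subset by model performance. That second stage is not shown.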

Key words: Acetylcholinesterase inhibitor, Machine learning method, Feature selection, Applicability domain

MSC2000: 

  • O641