Acta Phys. -Chim. Sin., 2011, Vol. 27, Issue (02): 343-351. doi: 10.3866/PKU.WHXB20110219


Support Vector Machine and KStar Models Predict the o-Dealkylation Reaction Mediated by Cytochrome P450

WANG Dan, ZHANG Yan-Ling, QIAO Yan-Jiang   

  1. School of Chinese Pharmacy, Beijing University of Chinese Medicine, Beijing 100102, P. R. China
  • Received: 2010-09-13 Revised: 2010-11-18 Published: 2011-01-25
  • Contact: QIAO Yan-Jiang
  • Supported by:

    The project was supported by the Specialized Research Fund for the Doctoral Program of Higher Education, China (20092213120006) and Research Fund of State Administration of TCM of People’s Republic of China (200707010).


We constructed a nested prediction model based on support vector machines (SVM) and the KStar method. The nested model consisted of two parts: a molecular shape discriminative model, which was used to predict whether a molecule undergoes the o-dealkylation reaction mediated by cytochrome P450, and a metabolic site discriminative model, which was used to judge whether a given C-O bond in the molecule breaks. We calculated 1280 molecular descriptors, including topological descriptors, 2D autocorrelation descriptors, and geometric descriptors, to characterize the physicochemical properties of 272 molecules. The molecular shape discriminative model was built as a classification model using machine learning methods including SVM, decision tree, Bayesian network, and the k-nearest neighbors method. The results showed that the SVM model was superior to the other methods. For the metabolic site discriminative model, twenty-six quantum chemical features, including charge-related, valency-related, and energy-related features, were calculated for the 538 metabolic sites of the o-dealkylation reaction. Machine learning methods including decision tree, Bayesian network, KStar, and the artificial neural network method were used to develop the classification models. The KStar model, with prediction accuracy, sensitivity, and specificity all above 90%, outperformed the other classification models. Fifteen molecules from traditional Chinese medicines were used to validate the nested model. The results showed that the nested model achieved a degree of accuracy and could contribute to the prediction of metabolites of traditional Chinese medicines.
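The two-stage structure and the three reported evaluation measures can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `nested_predict` and `classification_metrics` are hypothetical, and the two classifier arguments are placeholder callables standing in for a trained SVM (molecule level) and a trained KStar model (site level).

```python
from typing import Callable, Dict, List, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def nested_predict(
    molecule_features: Sequence[float],
    site_features: List[Sequence[float]],
    molecule_model: Callable[[Sequence[float]], bool],  # assumed: stands in for the SVM
    site_model: Callable[[Sequence[float]], bool],      # assumed: stands in for KStar
) -> List[int]:
    """Return indices of candidate C-O sites predicted to break.

    Stage 1 decides whether the molecule is a substrate at all;
    stage 2 is only consulted for molecules that pass stage 1.
    """
    if not molecule_model(molecule_features):
        return []
    return [i for i, site in enumerate(site_features) if site_model(site)]
```

Here each molecule would be described by its descriptor vector and each candidate site by its quantum chemical feature vector; the metrics function applies equally to either stage when true labels are available.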

Key words: Support vector machine, Cytochrome P450 enzyme, KStar, o-Dealkylation reaction


  • O641