Acta Phys. -Chim. Sin.  2009, Vol. 25 Issue (08): 1587-1592    DOI: 10.3866/PKU.WHXB20090752
Article     
A Novel QSAR Model Based on Geostatistics and Support Vector Regression
CHEN Yuan, YUAN Zhe-Ming, ZHOU Wei, XIONG Xing-Yao
College of Bio-safety Science and Technology, Hunan Agricultural University, Changsha 410128, P. R. China; Hunan Provincial Key Laboratory of Crop Germplasm Innovation and Utilization, Hunan Agricultural University, Changsha 410128, P. R. China

Abstract  

Based on principal component analysis (PCA), geostatistics (GS) and support vector regression (SVR), a novel individual forecasting method for quantitative structure-activity relationship (QSAR) modeling, Weight-PCA-GS-SVR, was proposed. The basic principles were as follows: firstly, PCA was used to reduce the dimensionality of the independent descriptors and eliminate redundant information among them; secondly, the principal components that have no relationship to activity were removed nonlinearly using SVR; thirdly, weighted distances between samples were calculated from the retained principal components; fourthly, a common range was determined using high-dimensional geostatistics; lastly, for each test sample, the k nearest neighbors were selected from the training samples whose weighted distances to it were shorter than the common range, and an SVR model was built on those neighbors to give an individual prediction for that sample. Weight-PCA-GS-SVR optimized the model along both the column direction (descriptors) and the row direction (samples), and retained all the advantages of SVR. It therefore provides a new way to choose the k nearest neighbors as well as a novel weighting method for determining the retained principal components or the retained descriptors. Predicted results from three data sets all verify that the novel method has the highest prediction precision among all reference models and has a remarkable advantage over reported results. Weight-PCA-GS-SVR can therefore be widely used in QSAR and other regression prediction fields.
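The neighbor-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a weighted Euclidean form for the distance, takes the common range as a given number (in the paper it is derived with geostatistics), and leaves out the PCA and SVR steps entirely. The names `weighted_distance` and `select_neighbors`, the weights, and the toy scores are all hypothetical.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two principal-component score vectors.

    The weighted Euclidean form is an assumption; the abstract only says
    that distances are weighted.
    """
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def select_neighbors(test_scores, train_scores, weights, common_range, k):
    """Return indices of up to k training samples nearest to the test sample.

    Only training samples whose weighted distance is strictly shorter than
    common_range are eligible, so fewer than k indices may be returned.
    """
    candidates = []
    for idx, row in enumerate(train_scores):
        d = weighted_distance(test_scores, row, weights)
        if d < common_range:
            candidates.append((d, idx))
    candidates.sort()  # nearest first; ties broken by index
    return [idx for _, idx in candidates[:k]]


# Toy data: four training samples described by two retained components.
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [5.0, 5.0]]
weights = [1.0, 0.25]  # hypothetical per-component weights
neighbors = select_neighbors([0.0, 0.0], train, weights, common_range=2.0, k=2)
print(neighbors)
```

In a full pipeline, the returned indices would pick out the training rows used to fit a local regression model for that one test sample, which is what makes the prediction "individual".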



Key words: Quantitative structure-activity relationship      Geostatistics      Support vector regression      Principal component analysis      Individual prediction
Received: 16 March 2009      Published: 26 May 2009
MSC2000:  O641  
Corresponding Authors: YUAN Zhe-Ming     E-mail: zhmyuan@sina.com
Cite this article:

CHEN Yuan, YUAN Zhe-Ming, ZHOU Wei, XIONG Xing-Yao. A Novel QSAR Model Based on Geostatistics and Support Vector Regression. Acta Phys. -Chim. Sin., 2009, 25(08): 1587-1592.

URL:

http://www.whxb.pku.edu.cn/10.3866/PKU.WHXB20090752     OR     http://www.whxb.pku.edu.cn/Y2009/V25/I08/1587
