Acta Phys. -Chim. Sin.  2017, Vol. 33 Issue (5): 918-926    DOI: 10.3866/PKU.WHXB201701163
Article     
Developing a Support Vector Machine Based QSPR Model to Predict Gas-to-Benzene Solvation Enthalpy of Organic Compounds
GOLMOHAMMADI Hassan1, DASHTBOZORGI Zahra2, KHOOSHECHIN Sajad2
1 Young Researchers and Elite Club, Yadegar-e-Imam Khomeini (RAH) Shahr-e-Rey Branch, Islamic Azad University, Tehran, Iran;
2 Young Researchers and Elite Club, Central Tehran Branch, Islamic Azad University, Tehran, Iran

Abstract  

This paper presents a new approach to building quantitative structure-property relationship (QSPR) models for predicting the gas-to-benzene solvation enthalpy (ΔHSolv) of 158 organic compounds from molecular descriptors calculated from the structure alone. Different kinds of descriptors were calculated for each compound using the Dragon package. The enhanced replacement method (ERM), a variable selection technique, was employed to select the optimal subset of descriptors. Our investigation reveals that the relationship between the physico-chemical properties and the solvation enthalpy is nonlinear and that ERM is unable to model the solvation enthalpy accurately. The standard error for the prediction set is 1.681 kJ·mol⁻¹ for the support vector machine (SVM) and 4.624 kJ·mol⁻¹ for ERM. The ΔHSolv values calculated by SVM were in good agreement with the experimental ones, and the SVM models outperformed the ERM model. This indicates that SVM can be used as an alternative modeling tool for QSPR studies.
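The comparison of prediction-set standard errors can be illustrated with a minimal sketch. It assumes the standard error is a root-mean-square deviation between observed and predicted values; the paper's exact formula (for example, any degrees-of-freedom correction) is not stated in this abstract. The function name `standard_error` and all numbers below are hypothetical toy values, not data from the study.

```python
import math


def standard_error(observed, predicted):
    """Root-mean-square deviation between observed and predicted values.

    Assumption: no degrees-of-freedom correction is applied.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy enthalpies in kJ/mol (NOT taken from the paper).
observed = [-30.0, -42.0, -55.0, -61.0]
model_a = [-31.0, -41.0, -56.0, -60.0]  # every prediction off by 1 kJ/mol
model_b = [-34.0, -38.0, -59.0, -57.0]  # every prediction off by 4 kJ/mol

se_a = standard_error(observed, model_a)
se_b = standard_error(observed, model_b)
print(f"model A: {se_a:.3f} kJ/mol, model B: {se_b:.3f} kJ/mol")
```

Under this definition, the model with the smaller value reproduces the held-out measurements more closely, which is the sense in which the abstract ranks SVM above ERM.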



Key words: Quantitative structure-property relationship; Gas-to-benzene solvation enthalpy; Descriptor; Enhanced replacement method; Support vector machine
Received: 13 December 2016      Published: 16 January 2017
Corresponding Authors: GOLMOHAMMADI Hassan     E-mail: hassan.gol@gmail.com
Cite this article:

GOLMOHAMMADI Hassan, DASHTBOZORGI Zahra, KHOOSHECHIN Sajad. Developing a Support Vector Machine Based QSPR Model to Predict Gas-to-Benzene Solvation Enthalpy of Organic Compounds. Acta Phys. -Chim. Sin., 2017, 33(5): 918-926.

URL:

http://www.whxb.pku.edu.cn/Jwk_wk/wlhx/10.3866/PKU.WHXB201701163     OR     http://www.whxb.pku.edu.cn/Jwk_wk/wlhx/Y2017/V33/I5/918

[1] DING Xiaoqin, DING Junjie, LI Dayu, PAN Li, PEI Chengxin. Toxicity Prediction of Organophosphorus Chemical Reactivity Compounds Based on Conceptual DFT[J]. Acta Phys. -Chim. Sin., 2018, 34(3): 314-322.
[2] GHARA Manas, CHATTARAJ Pratim K. Bonding and Reactivity in RB-AsR Systems (R=H, F, OH, CH3, CMe3, CF3, SiF3, BO): Substituent Effects[J]. Acta Phys. -Chim. Sin., 2018, 34(2): 201-207.
[3] GOLMOHAMMADI Hassan, DASHTBOZORGI Zahra, KHOOSHECHIN Sajad. Prediction of Blood-to-Brain Barrier Partitioning of Drugs and Organic Compounds Using a QSPR Approach[J]. Acta Phys. -Chim. Sin., 2017, 33(6): 1160-1170.
[4] HE Bing, LUO Yong, LI Bing-Ke, XUE Ying, YU Luo-Ting, QIU Xiao-Long, YANG Teng-Kuei. Predicting and Virtually Screening Breast Cancer Targeting Protein HEC1 Inhibitors by Molecular Descriptors and Machine Learning Methods[J]. Acta Phys. -Chim. Sin., 2015, 31(9): 1795-1802.
[5] LIU Hai-Chun, LU Shuai, RAN Ting, ZHANG Yan-Min, XU Jin-Xing, XIONG Xiao, XU An-Yang, LU Tao, CHEN Ya-Dong. Accurate Activity Predictions of B-Raf Type II Inhibitors via Molecular Docking and QSAR Methods[J]. Acta Phys. -Chim. Sin., 2015, 31(11): 2191-2206.
[6] LIU Fen, ZOU Jian-Wei, HU Gui-Xiang, JIANG Yong-Jun. Quantitative Structure-Property Relationship Studies on the Adsorption of Aromatic Contaminants by Carbon Nanotubes[J]. Acta Phys. -Chim. Sin., 2014, 30(9): 1616-1624.
[7] SHI Jing-Jie, CHEN Li-Ping, CHEN Wang-Hua. QSPR Models of Compound Viscosity Based on Iterative Self-Organizing Data Analysis Technique and Ant Colony Algorithm[J]. Acta Phys. -Chim. Sin., 2014, 30(5): 803-810.
[8] FU Rong, LU Tian, CHEN Fei-Wu. Comparing Methods for Predicting the Reactive Site of Electrophilic Substitution[J]. Acta Phys. -Chim. Sin., 2014, 30(4): 628-639.
[9] LI Bing-Ke, CONG Yong, TIAN Zhi-Yue, XUE Ying. Predicting and Virtually Screening the Selective Inhibitors of MMP-13 over MMP-1 by Molecular Descriptors and Machine Learning Methods[J]. Acta Phys. -Chim. Sin., 2014, 30(1): 171-182.
[10] HAN Na, YUAN Zhe-Ming, CHEN Yuan, DAI Zhi-Jun, WANG Zhi-Ming. Prediction of HLA-A*0201 Restricted Cytotoxic T Lymphocyte Epitopes Based on High-Dimensional Descriptor Nonlinear Screening[J]. Acta Phys. -Chim. Sin., 2013, 29(09): 1945-1953.
[11] CONG Yong, XUE Ying. Quantitative Structure-Activity Relationship Study of the Non-Nucleoside Inhibitors of HCV NS5B Polymerase by Machine Learning Methods[J]. Acta Phys. -Chim. Sin., 2013, 29(08): 1639-1647.
[12] WANG Zhi-Ming, HAN Na, YUAN Zhe-Ming, WU Zhao-Hua. Feature Selection for High-Dimensional Data Based on Ridge Regression and SVM and Its Application in Peptide QSAR Modeling[J]. Acta Phys. -Chim. Sin., 2013, 29(03): 498-507.
[13] LÜ Wei, XUE Ying, MENG Qing-Wei. Classification Prediction of Inhibitors of H1N1 Neuraminidase by Machine Learning Methods[J]. Acta Phys. -Chim. Sin., 2013, 29(01): 217-223.
[14] SHI Jing-Jie, CHEN Li-Ping, CHEN Wang-Hua, SHI Ning, YANG Hui, XU Wei. Prediction of the Thermal Conductivity of Organic Compounds Using Heuristic and Support Vector Machine Methods[J]. Acta Phys. -Chim. Sin., 2012, 28(12): 2790-2796.
[15] ZHANG Qing-You, LONG Hai-Lin, FENG Xiu-Lin, SUO Jing-Jie, ZHANG Dan-Dan, LI Jing-Ya, XU Li-Zhuang, XU Lu. MOLMAP Descriptor and Its Application to Mutagenicity Prediction[J]. Acta Phys. -Chim. Sin., 2012, 28(03): 541-546.