Acta Physico-Chimica Sinica >> 2017, Vol. 33 >> Issue (5): 918-926. doi: 10.3866/PKU.WHXB201701163


Developing a Support Vector Machine Based QSPR Model to Predict Gas-to-Benzene Solvation Enthalpy of Organic Compounds

GOLMOHAMMADI Hassan1,*, DASHTBOZORGI Zahra2, KHOOSHECHIN Sajad2

  1. Young Researchers and Elite Club, Yadegar-e-Imam Khomeini (RAH) Shahr-e-Rey Branch, Islamic Azad University, Tehran, Iran
  2. Young Researchers and Elite Club, Central Tehran Branch, Islamic Azad University, Tehran, Iran
  • Received: 2016-12-13 Published: 2017-04-20
  • Corresponding author: GOLMOHAMMADI Hassan E-mail: hassan.gol@gmail.com


Abstract:

The purpose of this paper is to present a novel approach to building quantitative structure-property relationship (QSPR) models for predicting the gas-to-benzene solvation enthalpy (ΔH_Solv) of 158 organic compounds based on molecular descriptors calculated from the structure alone. Different kinds of descriptors were calculated for each compound using the Dragon package. The enhanced replacement method (ERM), a variable selection technique, was employed to select an optimal subset of descriptors. Our investigation reveals that the relationship between the physicochemical properties and the solvation enthalpy is nonlinear and that the ERM model is unable to describe the solvation enthalpy accurately. The standard error for the prediction set is 1.681 kJ·mol⁻¹ for the support vector machine (SVM) model, compared with 4.624 kJ·mol⁻¹ for the ERM model. The results show that the ΔH_Solv values calculated by SVM are in good agreement with the experimental ones and that the SVM model outperforms the ERM model. This indicates that SVM can be used as an alternative modeling tool for QSPR studies.

Key words: Quantitative structure-property relationship, Gas-to-benzene solvation enthalpy, Descriptor, Enhanced replacement method, Support vector machine
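
The abstract compares the two models by their standard error on the prediction set. As a minimal sketch of how such a comparison could be computed, the Python snippet below assumes the standard error is the root-mean-square deviation between experimental and calculated values; the abstract does not give the exact formula, so this definition is an assumption. The function name `standard_error` and all numeric values are hypothetical illustrations, not data from the paper.

```python
import math


def standard_error(y_exp, y_calc):
    """Root-mean-square deviation between experimental and calculated values.

    Assumption: the paper's "standard error" is taken here as
    sqrt(sum((exp - calc)^2) / n); the abstract does not state the formula.
    """
    if len(y_exp) != len(y_calc) or not y_exp:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))
    return math.sqrt(squared / len(y_exp))


# Hypothetical solvation enthalpies in kJ/mol (not taken from the paper).
y_exp = [-30.0, -42.5, -25.0, -51.0]     # experimental values
y_model_a = [-31.0, -41.5, -26.0, -50.0]  # e.g. a nonlinear model
y_model_b = [-34.0, -38.5, -29.0, -47.0]  # e.g. a linear model

print(standard_error(y_exp, y_model_a))
print(standard_error(y_exp, y_model_b))
```

A lower value means the calculated enthalpies lie closer to the experimental ones, which is the sense in which the abstract reports 1.681 kJ·mol⁻¹ for SVM against 4.624 kJ·mol⁻¹ for ERM.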
