Acta Phys. -Chim. Sin., 2012, Vol. 28, Issue (12): 2790-2796. doi: 10.3866/PKU.WHXB201209273

• THEORETICAL AND COMPUTATIONAL CHEMISTRY •

Prediction of the Thermal Conductivity of Organic Compounds Using Heuristic and Support Vector Machine Methods

SHI Jing-Jie1,2, CHEN Li-Ping1, CHEN Wang-Hua1, SHI Ning2, YANG Hui1, XU Wei2   

  1. Department of Safety Engineering, School of Chemical Engineering, Nanjing University of Science & Technology, Nanjing 210094, P. R. China;
  2. State Key Laboratory of Chemical Safety and Control, Qingdao 266071, Shandong Province, P. R. China
  • Received: 2012-07-16   Revised: 2012-09-10   Published: 2012-11-14
  • Supported by:

    The project was supported by the National Key Basic Research Program of China (973) (2010CB735510).

Abstract:

To build a quantitative structure-property relationship (QSPR) between the molecular structures and the thermal conductivities of 147 organic compounds, and to investigate which structural factors influence the thermal conductivity of organic molecules, the topological, constitutional, geometrical, electrostatic, quantum-chemical, and thermodynamic descriptors of the compounds were calculated using the CODESSA software package. These descriptors were then pre-selected by the heuristic method (HM). The dataset of 147 organic compounds was randomly divided into a training set (118 compounds) and a test set (29 compounds). A five-descriptor linear model was constructed to describe the relationship between the molecular structures and the thermal conductivities. In addition, a non-linear regression model was built using a support vector machine (SVM) with the same five descriptors. Although the fitting performance of the SVM model (squared correlation coefficient, R²=0.9240) was slightly worse than that of the HM model (R²=0.9267), the predictive performance of the SVM model (R²=0.9682) was better than that of the HM model (R²=0.9574). Because predictive performance is more important than fitting performance, the SVM model is considered superior to the HM model. Both proposed methods (SVM and HM) can be used to predict the thermal conductivity of organic compounds from pre-selected theoretical descriptors, which can be calculated directly and solely from the molecular structure.
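
The data handling described in the abstract can be illustrated with a minimal Python sketch. It shows a random 118/29 partition of 147 compound indices and a squared correlation coefficient between observed and predicted values. It is not the authors' procedure: the function names, the random seed, and the reading of R² as the squared Pearson correlation are assumptions made for illustration, and no data from the paper are used.

```python
import random
from math import fsum


def split_dataset(n_total=147, n_train=118, seed=0):
    """Randomly partition indices 0..n_total-1 into training and test sets.

    The seed is an arbitrary (hypothetical) choice; the abstract only says
    the split was random.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])


def squared_correlation(observed, predicted):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: R² in the abstract is taken to mean this quantity.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(observed)
    mean_o = fsum(observed) / n
    mean_p = fsum(predicted) / n
    cov = fsum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = fsum((o - mean_o) ** 2 for o in observed)
    var_p = fsum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant sequences")
    return cov * cov / (var_o * var_p)


train_idx, test_idx = split_dataset()
print(len(train_idx), len(test_idx))
```

In such a workflow, a model fitted on the training indices would be scored with `squared_correlation` on both subsets, which mirrors the comparison of fitting and predictive performance reported for the HM and SVM models.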

Key words: Heuristic method, Support vector machine, Thermal conductivity, Prediction, QSPR