Acta Phys. -Chim. Sin. ›› 2009, Vol. 25 ›› Issue (12): 2558-2564. doi: 10.3866/PKU.WHXB20091122
• ARTICLE •
LIU Yue, LI Xiao-Qin, XU Hai-Song, QIAO Hui
How the amino acid sequence of a protein determines its structure is a core issue in biology. The protein fold type reflects the topological pattern of the structure's core, and fold recognition is an important method in protein sequence-structure research. This article focuses on the 36 fold types that are not covered by a unified hidden Markov model (HMM) but that account for 41.8% of the α, β, and α/β proteins in the Astral 1.65 sequence database. The training set contains samples that have less than 25% sequence identity with each other. We applied hierarchical clustering based on root mean square deviation (RMSD) to generate fold subgroups. A profile-HMM was then built for each subgroup from a structural alignment produced by the multiple structural alignment algorithm MUSTANG. After testing 9505 proteins with less than 95% sequence identity from the Astral 1.65 database, the average sensitivity, specificity, and Matthews correlation coefficient (MCC) of the 36 fold types were found to be 90%, 99%, and 0.95, respectively. These results show that classification modeling according to RMSD can achieve precise fold recognition in cases where a unified HMM cannot be built because the training set contains too many elements. This work provides a new method and new ideas for profile-HMM protein fold recognition and lays a foundation for further research.
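For readers unfamiliar with the three evaluation measures named above, the following is a minimal sketch of how sensitivity, specificity, and MCC are conventionally computed from the confusion counts of a single fold type. The function name `fold_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the article, and the authors' own evaluation code is not described in the abstract.

```python
import math


def fold_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, MCC) for one fold type.

    tp: proteins of this fold recognized as this fold
    fp: proteins of other folds recognized as this fold
    tn: proteins of other folds correctly rejected
    fn: proteins of this fold that were missed
    """
    # Sensitivity: fraction of true members that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-members that are rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for one fold type (illustration only, not from the article).
sens, spec, mcc = fold_metrics(tp=90, fp=10, tn=890, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Averaging these per-fold values over all fold types gives summary figures of the kind reported in the abstract. MCC is useful here because each fold type is a small positive class against a large negative background, a setting in which specificity alone can look high even for a weak classifier.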
Protein fold type,
LIU Yue, LI Xiao-Qin, XU Hai-Song, QIAO Hui. Classification Modeling and Recognition of Protein Fold Type[J]. Acta Phys. -Chim. Sin. 2009, 25(12), 2558-2564. doi: 10.3866/PKU.WHXB20091122