Protein Science
Published online before print December 22, 2006, doi: 10.1110/ps.062523907
Protein Science (2007), 16:216-226. Published by Cold Spring Harbor Laboratory Press. Copyright © 2007 The Protein Society

Evaluation of features for catalytic residue prediction in novel folds

Eunseog Youn1, Brandon Peters1, Predrag Radivojac2, and Sean D. Mooney1

1 Center for Computational Biology and Bioinformatics, Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana 46202, USA
2 School of Informatics, Indiana University, Bloomington, Indiana 47408, USA

(RECEIVED August 25, 2006; FINAL REVISION November 8, 2006; ACCEPTED November 10, 2006)

Structural genomics projects are determining the three-dimensional structures of proteins without full characterization of their function. A critical part of the annotation process involves appropriate knowledge representation and prediction of functionally important residue environments. We have developed a method to extract features from sequence, sequence alignments, three-dimensional structure, and structural environment conservation, and used support vector machines to annotate homologous and nonhomologous residue positions based on a specific training set of residue functions. To evaluate this pipeline for automated protein annotation, we applied it to the challenging problem of predicting catalytic residues in enzymes. We also ranked the features by their ability to discriminate catalytic from noncatalytic residues. When applying our method to a well-annotated set of protein structures, we found that the top-ranked features were a measure of sequence conservation, a measure of structural conservation, the degree of uniqueness of a residue's structural environment, solvent accessibility, and residue hydrophobicity. We also found that features based on structural conservation were complementary to those based on sequence conservation and that they were capable of increasing predictor performance. Using a family nonredundant version of the ASTRAL 40 v1.65 data set, we estimated that the true catalytic residues were correctly predicted in 57.0% of the cases, with a precision of 18.5%. When testing on proteins containing novel folds not used in training, the best features were highly correlated with those obtained when training on families, thus validating the approach to nonhomologous catalytic residue prediction in general. We then applied the method to 2781 coordinate files from the structural genomics target pipeline and identified both highly ranked and highly clustered groups of predicted catalytic residues.

Keywords: catalytic residue prediction; structural environment conservation; feature evaluation
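
The two performance figures quoted in the abstract can be read as recall (the fraction of true catalytic residues that are predicted) and precision (the fraction of predicted residues that are truly catalytic). The short Python sketch below shows how both follow from confusion-matrix counts. The counts and the helper name `precision_recall` are hypothetical, chosen only so that the arithmetic reproduces 57.0% and 18.5%; they are not taken from the article.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from confusion-matrix counts.

    tp: catalytic residues predicted as catalytic
    fp: noncatalytic residues predicted as catalytic
    fn: catalytic residues that were missed
    """
    predicted_positive = tp + fp
    actual_positive = tp + fn
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    return precision, recall


# Hypothetical counts (NOT from the article), picked to match the
# percentages reported in the abstract.
TP, FP, FN = 570, 2511, 430

precision, recall = precision_recall(TP, FP, FN)
print(f"recall = {recall:.1%}, precision = {precision:.1%}")
# prints: recall = 57.0%, precision = 18.5%
```

With counts like these, more than four of every five predicted residues are false positives even though most true catalytic residues are recovered, which is the usual pattern when the positive class (catalytic residues) is rare compared with the negative class.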





This article has been cited by other articles:


Y.-R. Tang, Z.-Y. Sheng, Y.-Z. Chen, and Z. Zhang
An improved prediction of catalytic residues in enzyme structures
Protein Eng. Des. Sel., May 1, 2008; 21(5): 295 - 302.


J. D. Fischer, C. E. Mayer, and J. Soding
Prediction of protein functional residues from sequence by probability density estimation
Bioinformatics, March 1, 2008; 24(5): 613 - 620.


W. Tong, R. J. Williams, Y. Wei, L. F. Murga, J. Ko, and M. J. Ondrechen
Enhanced performance in prediction of protein active sites with THEMATICS and support vector machines
Protein Sci., February 1, 2008; 17(2): 333 - 341.