Protein Science
Published online before print May 2, 2006, 10.1110/ps.062153506
Protein Science (2006), 15:1550-1556. Published by Cold Spring Harbor Laboratory Press. Copyright © 2006 The Protein Society
AUTOMATED FUNCTION PREDICTION

Enhanced automated function prediction using distantly related sequences and contextual association by PFP

Troy Hawkins1, Stanislav Luban1,2 and Daisuke Kihara1,2,3,4

1 Department of Biological Sciences
2 Department of Computer Sciences
3 Markey Center for Structural Biology
4 The Bindley Bioscience Center, College of Science, Purdue University, West Lafayette, Indiana 47907, USA

(RECEIVED February 10, 2006; FINAL REVISION February 10, 2006; ACCEPTED February 12, 2006)

The impetus for the recent development and emergence of automated function prediction methods is an exponentially growing flood of new experimental data, the interpretation of which is hindered by a shortage of reliable annotations for proteins that lack experimental characterization or significant homologs in current databases. Here we introduce PFP, an automated function prediction server that provides the most probable annotations for a query sequence in each of the three branches of the Gene Ontology: biological process, molecular function, and cellular component. Rather than utilizing precise pattern matching to identify functional motifs in the sequences and structures of these proteins, we designed PFP to increase the coverage of function annotation by lowering the resolution of predictions when a detailed function is not predictable. To do this, we extend a traditional PSI-BLAST search by extracting and scoring annotations (GO terms) individually, including annotations from distantly related sequences, and applying a novel data mining tool, the Function Association Matrix, to score strongly associated pairs of annotations. We show that PFP can correctly assign function using only weakly similar sequences with significantly better accuracy and coverage than a standard PSI-BLAST search, a more than fivefold improvement. The most descriptive annotations predicted by PFP (GO depth ≥8) can identify a significant subgraph in the GO with >60% accuracy and ~100% coverage for our benchmark set. We also provide examples of the superb performance of PFP in an assessment of automated function prediction servers at the Automated Function Prediction Special Interest Group meeting at ISMB 2005 (AFP-SIG '05).

Keywords: protein function prediction; PSI-BLAST; gene ontology; low-resolution function
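
The scoring idea summarized in the abstract (score each GO term individually over all PSI-BLAST hits, including weak ones, and let strongly associated terms reinforce each other) can be illustrated with a small Python sketch. This is not the PFP implementation. The E-value weight (-log10(E) plus a constant `shift`), the co-occurrence estimate used for the association matrix, and every name in the code (`association_matrix`, `score_terms`, the `term_*` labels) are assumptions made for illustration only.

```python
import math
from collections import defaultdict


def association_matrix(annotated_proteins):
    """Estimate P(fa | fj) from term co-occurrence in annotated proteins.

    annotated_proteins: iterable of term lists, one list per protein.
    Returns a nested dict: assoc[fj][fa] = fraction of proteins carrying
    fj that also carry fa (so assoc[fj][fj] is always 1.0).
    This estimator is an assumption, not the published definition.
    """
    term_count = defaultdict(int)
    pair_count = defaultdict(lambda: defaultdict(int))
    for terms in annotated_proteins:
        unique = set(terms)
        for fj in unique:
            term_count[fj] += 1
            for fa in unique:
                pair_count[fj][fa] += 1
    return {
        fj: {fa: n / term_count[fj] for fa, n in row.items()}
        for fj, row in pair_count.items()
    }


def score_terms(hits, assoc, shift=2.0):
    """Score every term reachable from the hits of a sequence search.

    hits: list of (e_value, [terms]) pairs, one per retrieved sequence.
    shift: hypothetical constant that keeps weak hits (E-value up to
           10**shift) contributing a small positive weight.
    """
    scores = defaultdict(float)
    for e_value, terms in hits:
        weight = -math.log10(e_value) + shift
        if weight <= 0:
            continue  # hit too weak to contribute under this weighting
        for fj in set(terms):
            # Terms absent from the matrix only support themselves.
            for fa, p in assoc.get(fj, {fj: 1.0}).items():
                scores[fa] += weight * p
    return dict(scores)


# Toy annotation database and toy search result (hypothetical labels).
database = [
    ["term_A", "term_B"],
    ["term_A", "term_B"],
    ["term_A"],
    ["term_C"],
]
assoc = association_matrix(database)
hits = [(1e-3, ["term_A"]), (10.0, ["term_C"]), (1000.0, ["term_C"])]
scores = score_terms(hits, assoc)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy run, `term_B` receives a score even though no hit is annotated with it, because it frequently co-occurs with `term_A` in the toy database. The hit with E-value 10 still contributes a small weight, while the hit with E-value 1000 is ignored.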








