1 Department of Biological Sciences
2 Department of Computer Sciences
3 Markey Center for Structural Biology
4 The Bindley Bioscience Center, College of Science, Purdue University, West Lafayette, Indiana 47907, USA
(RECEIVED February 10, 2006; FINAL REVISION February 10, 2006; ACCEPTED February 12, 2006)
The impetus for the recent development and emergence of automated function prediction methods is an exponentially growing flood of new experimental data, the interpretation of which is hindered by a shortage of reliable annotations for proteins that lack experimental characterization or significant homologs in current databases. Here we introduce PFP, an automated function prediction server that provides the most probable annotations for a query sequence in each of the three branches of the Gene Ontology: biological process, molecular function, and cellular component. Rather than utilizing precise pattern matching to identify functional motifs in the sequences and structures of these proteins, we designed PFP to increase the coverage of function annotation by lowering resolution of predictions when a detailed function is not predictable. To do this we extend a traditional PSI-BLAST search by extracting and scoring annotations (GO terms) individually, including annotations from distantly related sequences, and applying a novel data mining tool, the Function Association Matrix, to score strongly associated pairs of annotations. We show that PFP can correctly assign function using only weakly similar sequences with a significantly better accuracy and coverage than a standard PSI-BLAST search, improving it more than fivefold. The most descriptive annotations predicted by PFP (GO depth 8) can identify a significant subgraph in the GO with >60% accuracy and 100% coverage for our benchmark set. We also provide examples of the superb performance of PFP in an assessment of automated function prediction servers at the Automated Function Prediction Special Interest Group meeting at ISMB 2005 (AFP-SIG '05).
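The scoring idea described above (score each GO term individually over all PSI-BLAST hits, including weak ones, then credit strongly associated terms) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the function name `score_go_terms`, the hit weight `-log10(E) + shift`, the default `shift` of 2.0, and the representation of the Function Association Matrix as a nested dictionary of conditional probabilities. It is not the PFP server's implementation.

```python
import math
from collections import defaultdict


def score_go_terms(hits, association, shift=2.0):
    """Rank GO terms gathered from sequence-search hits.

    hits        : list of (e_value, [go_term, ...]) pairs, one per retrieved sequence.
    association : hypothetical association matrix as {source_term: {assoc_term: p}},
                  where p is an assumed conditional probability P(assoc | source).
    shift       : assumed offset so that weak hits still contribute a positive weight.
    """
    scores = defaultdict(float)
    for e_value, terms in hits:
        # Assumed weighting: stronger hits (smaller E-value) count more.
        weight = -math.log10(e_value) + shift
        if weight <= 0:
            continue  # hit is too weak to contribute under this weighting
        for source in terms:
            # The term annotated on the hit receives the full weight.
            scores[source] += weight
            # Associated terms receive a share proportional to their association.
            for assoc_term, p in association.get(source, {}).items():
                if assoc_term != source:
                    scores[assoc_term] += weight * p
    # Highest-scoring terms first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy data with placeholder term identifiers.
    hits = [(1e-3, ["GO:A"]), (10.0, ["GO:B"])]
    association = {"GO:A": {"GO:C": 0.5}}
    for term, score in score_go_terms(hits, association):
        print(term, round(score, 3))
```

In this toy run the weak hit (E-value 10) still contributes to `GO:B`, and `GO:C` is scored even though no hit carries it directly, which is the sense in which distantly related sequences and associated annotation pairs can raise coverage.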
Keywords: protein function prediction; PSI-BLAST; gene ontology; low-resolution function