Protein Science
Protein Science (2004), 13:2857-2863. Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society

Predicting enzyme family class in a hybridization space

Kuo-Chen Chou1,2 and Yu-Dong Cai1,3

1 Gordon Life Science Institute, San Diego, California 92130, USA
2 Tianjin Institute of Bioinformatics and Drug Discovery (TIBDD), Tianjin, China
3 Biomolecular Sciences Department, University of Manchester Institute of Science and Technology (UMIST), Manchester, M60 1QD, United Kingdom

(RECEIVED July 9, 2004; FINAL REVISION August 10, 2004; ACCEPTED August 11, 2004)

Given the sequence of a protein, how can we predict whether it is an enzyme or a non-enzyme? If it is an enzyme, to which enzyme family class does it belong? Because these questions bear directly on the biological function of a protein and the substrate it acts on, their importance is self-evident. With the explosion of protein sequences entering data banks and the much slower progress in determining their functions by biochemical experiments, an automated method that gives fast answers to these questions is highly desirable. By hybridizing the gene ontology and the pseudo-amino-acid composition, we have introduced a new method, called the GO-PseAA predictor, that operates in a hybridization space. To avoid redundancy and bias, demonstrations were performed on a data set in which none of the proteins in an individual class has ≥40% sequence identity to any other. The overall success rate obtained by the jackknife cross-validation test was 93% in distinguishing enzymes from non-enzymes and 94% in identifying the enzyme family among the following six main Enzyme Commission (EC) classes: (1) oxidoreductase, (2) transferase, (3) hydrolase, (4) lyase, (5) isomerase, and (6) ligase. The corresponding rates by the independent data set test were 98% and 97%, respectively.
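
The jackknife test mentioned above singles out each protein in turn and predicts its class from all the remaining proteins; the success rate is the fraction predicted correctly. The following is a minimal sketch of that procedure only, not the GO-PseAA predictor itself. It assumes a plain 20-component amino acid composition and a nearest-neighbor rule as stand-ins for the hybridization-space features and the predictor described in the paper. The function names and the toy sequences are hypothetical and chosen purely for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-component amino acid composition (fractions) of a sequence.

    Assumption: this simple composition stands in for the GO-PseAA features.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_label(query, train):
    """Return the label of the training vector closest to the query
    (squared Euclidean distance)."""
    best = min(
        train,
        key=lambda item: sum((q - x) ** 2 for q, x in zip(query, item[0])),
    )
    return best[1]


def jackknife_success_rate(samples):
    """Leave each sample out in turn, predict it from the rest,
    and return the fraction of correct predictions."""
    correct = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_label(vec, rest) == label:
            correct += 1
    return correct / len(samples)


# Toy sequences invented for illustration; they are not from the paper's data set.
toy = [
    ("AAAAGGGG", "enzyme"),
    ("AAAGGGGA", "enzyme"),
    ("KKKKLLLL", "non-enzyme"),
    ("KKLLLLKK", "non-enzyme"),
]
samples = [(composition(s), y) for s, y in toy]
rate = jackknife_success_rate(samples)
print(f"jackknife success rate: {rate:.2f}")
```

Because every sample is predicted by a rule that never saw it, the jackknife rate reflects performance on unseen proteins rather than the classifier's memory of its own training data.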

Keywords: ENZYME database; 40% cutoff; Gene Ontology; pseudo-amino-acid composition; quasi-sequence-order effect; ISort predictor; GO-PseAA predictor; bioinformatics; proteomics


Reprint requests to: Kuo-Chen Chou, Gordon Life Science Institute, San Diego, CA 92130, USA; e-mail: kchou{at}san.rr.com; fax: (858) 484-1018.







