Protein Science

Protein Science, Vol 1, Issue 5 667-677, Copyright © 1992 by Cold Spring Harbor Laboratory Press


ARTICLE

Protein classification artificial neural system

C. WU, G. WHITSON, J. MCLARTY, A. ERMONGKONCHAI and T. C. CHANG
Department of Computer Science, The University of Texas at Tyler, Tyler, Texas 75701
Department of Epidemiology/Biomathematics, The University of Texas Health Center at Tyler, Tyler, Texas 75710

A neural network classification method is developed as an alternative approach to the large database search/organization problem. The system, termed Protein Classification Artificial Neural System (ProCANS), has been implemented on a Cray supercomputer for rapid superfamily classification of unknown proteins based on the information content of the neural interconnections. The system employs an n-gram hashing function, similar to the k-tuple method, for sequence encoding. A collection of modular back-propagation networks is used to store the large number of sequence patterns. The system has been trained and tested with the first 2,148 of the 8,309 entries of the annotated Protein Identification Resource protein sequence database (release 29). The entries included the electron transfer proteins and the six enzyme groups (oxidoreductases, transferases, hydrolases, lyases, isomerases, and ligases), with a total of 620 superfamilies. After a total training time of seven Cray central processing unit (CPU) hours, the system has reached a predictive accuracy of 90%. The classification is fast (0.1 Cray CPU second per sequence) because it involves only a feed-forward pass through the networks. The classification time on a full-scale system embedded with all known superfamilies is estimated to be within 1 CPU second. Although the training time will grow linearly with the number of entries, the classification time is expected to remain low even with a 10- to 100-fold increase in sequence entries. The neural database, which consists of a set of weight matrices of the networks, together with the ProCANS software, can be ported to other computers and made available to the genome community. The rapid and accurate superfamily classification would be valuable for the organization of protein sequence databases and for gene recognition in large sequencing projects.
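The n-gram encoding step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the n-gram length, the alphabet, or any scaling, so the 20-letter amino acid alphabet, the default n = 2, the count normalization, and the names `ngram_index` and `encode` are all assumptions made for illustration, not the ProCANS implementation. The idea shown is only that a sequence of any length maps to a fixed-length vector suitable as input to a feed-forward network.

```python
from itertools import product

# Assumption: the 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_index(n):
    """Assign every possible n-gram over the alphabet a fixed vector position."""
    return {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=n))}


def encode(sequence, n=2):
    """Return a fixed-length vector of relative n-gram frequencies.

    The vector has 20**n entries regardless of sequence length.
    N-grams containing letters outside the alphabet are skipped.
    """
    index = ngram_index(n)
    counts = [0.0] * len(index)
    total = 0
    for i in range(len(sequence) - n + 1):
        pos = index.get(sequence[i:i + n])
        if pos is not None:
            counts[pos] += 1
            total += 1
    if total:
        # Hypothetical normalization so that long and short sequences are comparable.
        counts = [c / total for c in counts]
    return counts


if __name__ == "__main__":
    vec = encode("MKVLAAGIVGLLLAQ", n=2)
    print(len(vec))  # vector length is 20**2 for bigrams
```

Because the input length is fixed by the alphabet and n rather than by the sequence, classifying a new protein under this scheme requires only one encoding pass and one forward pass through the trained networks, which is consistent with the low per-sequence classification time the abstract reports.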






Copyright © 1992 by The Protein Society.