Protein Science
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


Published online before print January 4, 2005, 10.1110/ps.041035505
Protein Science (2005), 14:431-444. Published by Cold Spring Harbor Laboratory Press. Copyright © 2005 The Protein Society
This Article
Right arrow Full Text
Right arrow Full Text (PDF)
Right arrow Supplemental Research Data
Right arrow All Versions of this Article:
ps.041035505v1
14/2/431    most recent
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Right arrow Citation Map
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via HighWire
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Zhang, Z.
Right arrow Articles by Grigorov, M. G.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Zhang, Z.
Right arrow Articles by Grigorov, M. G.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us   Add to Digg   Add to Reddit   Add to Technorati  
What's this?

Descriptor-based protein remote homology identification

Ziding Zhang, Sunil Kochhar and Martin G. Grigorov

Nestlé Research Center, BioAnalytical Science, CH-1000 Lausanne 26, Switzerland

(RECEIVED August 4, 2004; FINAL REVISION October 5, 2004; ACCEPTED October 14, 2004)

Here, we report a novel protein sequence descriptor-based remote homology identification method, able to infer fold relationships without the explicit knowledge of structure. In a first phase, we have individually benchmarked 13 different descriptor types in fold identification experiments in a highly diverse set of protein sequences. The relevant descriptors were related to the fold class membership by using simple similarity measures in the descriptor spaces, such as the cosine angle. Our results revealed that the three best-performing sets of descriptors were the sequence-alignment-based descriptor using PSI-BLAST e-values, the descriptors based on the alignment of secondary structural elements (SSEA), and the descriptors based on the occurrence of PROSITE functional motifs. In a second phase, the three top-performing descriptors were combined to obtain a final method with improved performance, which we named DescFold. Class membership was predicted by Support Vector Machine (SVM) learning. In comparison with the individual PSI-BLAST-based descriptor, the rate of remote homology identification increased from 33.7% to 46.3%. We found out that the composite set of descriptors was able to identify the true remote homolog for nearly every sixth sequence at the 95% confidence level, or some 10% more than a single PSI-BLAST search. We have benchmarked the DescFold method against several other state-of-the-art fold recognition algorithms for the 172 LiveBench-8 targets, and we concluded that it was able to add value to the existing techniques by providing a confident hit for at least 10% of the sequences not identifiable by the previously known methods.

Keywords: remote homology; sequence descriptor; secondary structure; sequence alignment; sequence motif; support vector machine

Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.041035505.


Reprint requests to: Ziding Zhang, Nestlé Research Center, BioAnalyti-cal Science, CH-1000 Lausanne 26, Switzerland; e-mail: Ziding. Zhang{at}rdls.nestle.com; fax: +41-21-785-9486.


Add to CiteULike CiteULike   Add to Connotea Connotea   Add to Del.icio.us Del.icio.us   Add to Digg Digg   Add to Reddit Reddit   Add to Technorati Technorati    What's this?


This article has been cited by other articles:


Home page
Protein Eng Des SelHome page
Y.-R. Tang, Y.-Z. Chen, C. A. Canchaya, and Z. Zhang
GANNPhos: a new phosphorylation site predictor based on a genetic algorithm integrated neural network
Protein Eng. Des. Sel., August 1, 2007; 20(8): 405 - 412.
[Abstract] [Full Text] [PDF]




HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
Copyright © 2005 by The Protein Society.