Protein Science
Protein Science (2004), 13:1612-1626. Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society

Scoring profile-to-profile sequence alignments

Guoli Wang and Roland L. Dunbrack, Jr.

Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA

(RECEIVED December 26, 2003; FINAL REVISION March 12, 2004; ACCEPTED March 16, 2004)



Abstract

Sequence alignment profiles have been shown to be very powerful in creating accurate sequence alignments. Profiles are often used to search a sequence database with a local alignment algorithm. More accurate and longer alignments have been obtained with profile-to-profile comparison. There are several steps that must be performed in creating profile–profile alignments, and each involves choices in parameters and algorithms. These steps include (1) what sequences to include in a multiple alignment used to build each profile, (2) how to weight similar sequences in the multiple alignment and how to determine amino acid frequencies from the weighted alignment, (3) how to score a column from one profile aligned to a column of the other profile, (4) how to score gaps in the profile–profile alignment, and (5) how to include structural information. Large-scale benchmarks consisting of pairs of homologous proteins with structurally determined sequence alignments are necessary for evaluating the efficacy of each scoring scheme. With such a benchmark, we have investigated the properties of profile–profile alignments and found that (1) with optimized gap penalties, most column–column scoring functions behave similarly to one another in alignment accuracy; (2) some functions, however, have much higher search sensitivity and specificity; (3) position-specific weighting schemes in determining amino acid counts in columns of multiple sequence alignments are better than sequence-specific schemes; (4) removing positions in the profile with gaps in the query sequence results in better alignments; and (5) adding predicted and known secondary structure information improves alignments.

Keywords: sequence profiles; profile–profile alignment; PSI-BLAST
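
Steps (3) and (4) listed in the abstract, scoring a column of one profile against a column of the other and penalizing gaps, can be illustrated with a minimal sketch. The abstract does not name the column–column functions that were benchmarked, so the two shown here (a dot product and one minus the Jensen-Shannon divergence) are only common examples of the kind of function meant. The flat pseudocount, the linear gap penalty, the global alignment mode, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math

def column_frequencies(counts, pseudocount=1.0):
    """Convert (weighted) residue counts for one column into frequencies.

    The flat pseudocount is an assumption made for this sketch only.
    """
    total = sum(counts) + pseudocount * len(counts)
    return [(c + pseudocount) / total for c in counts]

def dot_score(p, q):
    """Dot product of two column frequency vectors."""
    return sum(a * b for a, b in zip(p, q))

def js_similarity(p, q):
    """One minus the Jensen-Shannon divergence (log base 2), in [0, 1]."""
    mid = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing; y > 0 whenever x > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 1.0 - 0.5 * (kl(p, mid) + kl(q, mid))

def align_profiles(prof_a, prof_b, score=dot_score, gap=-0.2):
    """Global profile-profile alignment with a linear gap penalty.

    prof_a, prof_b: lists of columns, each a frequency vector.
    Returns (total score, list of aligned column index pairs).
    """
    n, m = len(prof_a), len(prof_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0], back[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        dp[0][j], back[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (dp[i - 1][j - 1] + score(prof_a[i - 1], prof_b[j - 1]), "diag"),
                (dp[i - 1][j] + gap, "up"),    # column of A aligned to a gap
                (dp[i][j - 1] + gap, "left"),  # column of B aligned to a gap
            ]
            dp[i][j], back[i][j] = max(candidates, key=lambda c: c[0])
    # Trace back from the bottom-right corner to recover aligned pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs

# Toy profiles over a 3-letter alphabet; B lacks A's middle column.
profile_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
profile_b = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
total, pairs = align_profiles(profile_a, profile_b, score=js_similarity)
print(total, pairs)
```

Passing the scoring function as a parameter mirrors the comparison described in the abstract: the alignment procedure stays fixed while the column–column function and the gap penalty are varied. A real profile would use 20-element columns and a gap penalty optimized separately for each scoring function.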


Reprint requests to: Roland L. Dunbrack, Jr., Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA; e-mail: RL_Dunbrack@fccc.edu; fax: (215) 728-2412.

Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.03601504.



