Protein Science, Vol 9, Issue 2 232-241, Copyright © 2000 by The Protein Society
JOURNAL ARTICLE
L Rychlewski, L Jaroszewski, W Li and A Godzik
San Diego Supercomputer Center, La Jolla, California 92093, USA.
Distant homologies between proteins are often discovered only after the three-dimensional structures of both proteins have been solved. The sequence divergence between such proteins can be so large that a simple comparison of their sequences fails to identify any similarity. A new generation of sensitive alignment tools uses averaged sequences of entire homologous families (profiles) to detect such homologies. Several algorithms, including the newest generation of BLAST algorithms and BASIC, an algorithm used in our group to assign fold predictions for proteins from several genomes, are compared to each other on a large set of structurally similar proteins with little sequence similarity. Proteins in the benchmark are classified according to their level of similarity, which allows us to demonstrate that most of the improvement of the new algorithms is achieved for proteins with strong functional similarities, with almost no progress in recognizing distant fold similarities. It is also shown that the details of profile calculation strongly influence its sensitivity in recognizing distant homologies. The most important choice is how to include information from diverging members of the family: the profile must account for the entire sequence divergence within the family without generating false predictions. PSI-BLAST takes a conservative approach, deriving a profile from core members of the family and providing a solid improvement with almost no false predictions. BASIC strives for better sensitivity by increasing the weight of divergent family members, paying the price in lower reliability. A new FFAS algorithm introduced here uses a new procedure for profile generation that takes into account all the relations within the family, and it matches BASIC sensitivity with PSI-BLAST-like reliability.
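The trade-off described above, between a profile dominated by core family members and one that gives more weight to divergent members, can be illustrated with a small sketch. This is not the PSI-BLAST, BASIC, or FFAS procedure: the weighting scheme (one minus the mean identity to the other family members), the `floor` parameter, the dot-product column score, and all function names are assumptions made for illustration only.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def identity(a, b):
    """Fraction of aligned columns in which both sequences carry the same residue."""
    same = sum(1 for x, y in zip(a, b) if x == y and x in AMINO_ACIDS)
    return same / len(a)


def divergence_weights(alignment, floor=0.05):
    """Hypothetical scheme: weight = 1 - mean identity to the other members.

    Divergent sequences get larger weights; `floor` keeps every weight positive.
    """
    weights = []
    for i, seq in enumerate(alignment):
        others = [s for j, s in enumerate(alignment) if j != i]
        if not others:
            weights.append(1.0)
            continue
        mean_id = sum(identity(seq, s) for s in others) / len(others)
        weights.append(max(1.0 - mean_id, floor))
    return weights


def build_profile(alignment, weights=None):
    """Return one weighted amino-acid frequency vector per alignment column."""
    if weights is None:
        weights = [1.0] * len(alignment)
    profile = []
    for col in range(len(alignment[0])):
        counts = dict.fromkeys(AMINO_ACIDS, 0.0)
        total = 0.0
        for seq, w in zip(alignment, weights):
            residue = seq[col]
            if residue in counts:  # gaps and unknown symbols are skipped
                counts[residue] += w
                total += w
        profile.append([counts[a] / total if total else 0.0 for a in AMINO_ACIDS])
    return profile


def column_score(p, q):
    """Dot product of two profile columns (an assumed similarity measure)."""
    return sum(a * b for a, b in zip(p, q))


# Toy family of three pre-aligned sequences (made-up data).
family = ["ACD", "ACE", "GCD"]
uniform = build_profile(family)
weighted = build_profile(family, divergence_weights(family))
```

In this toy family the first sequence is the closest to the other two, so it receives the smallest weight, and the residues contributed by the more divergent members take up a larger share of the weighted profile than of the uniform one.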