1 EMBL, Meyerhofstrasse 1, D-69117 Heidelberg, Germany
2 Institut de Biotecnologia i Biomedicina, Universitat Autonoma de Barcelona, Bellaterra 08193, Barcelona, Spain
The current pace of structural biology means that a protein's three-dimensional structure can now be known before its function, making methods that assign homology via structure comparison increasingly important. Previous research has suggested that sequence similarity after structure-based alignment is one of the best discriminators of homology and often of functional similarity. Here, we exploit this observation, together with a merger of protein structure and sequence databases, to predict distant homologous relationships. We use the Structural Classification of Proteins (SCOP) database to link sequence alignments from the SMART and Pfam databases. We thus provide new alignments that could not be constructed easily in the absence of known three-dimensional structures. We then extend the method of Murzin (1993b) to assign statistical significance to sequence identities found after structural alignment and thus suggest the best link between diverse sequence families. We find that several distantly related protein sequence families can be linked with confidence, showing the approach to be a means for inferring homologous relationships and thus possible functions when proteins are of known structure but of unknown function. The analysis also finds several new potential superfamilies, where inspection of the associated alignments and superimpositions reveals conservation of unusual structural features or co-location of conserved amino acids and bound substrates. We discuss implications for structural genomics initiatives and for improvements to sequence comparison methods.
Keywords: Protein structure; sequence; function; homology; structural genomics
Abbreviations: 3D, three-dimensional; Ig, immunoglobulin; RMSD, root mean square deviation; PDB, Protein Data Bank; ATP, adenosine triphosphate; SCOP, Structural Classification of Proteins; NCBI, National Center for Biotechnology Information; URL, uniform resource locator
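The central quantity in the abstract, sequence identity measured over a structure-based alignment, can be illustrated with a minimal sketch. This is not the method of Murzin (1993b) or its extension described above; it swaps in a plain binomial tail as a simplified stand-in, and the function names, the toy alignment, and the background match probability of 0.05 (roughly uniform over 20 amino acids) are all assumptions made for illustration.

```python
from math import comb
from typing import Tuple


def aligned_identity(seq_a: str, seq_b: str) -> Tuple[int, int]:
    """Return (identical residues, compared columns) for two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, so only
    structurally equivalent positions contribute.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    matches = sum(1 for a, b in pairs if a == b)
    return matches, len(pairs)


def binomial_tail(matches: int, n: int, p: float) -> float:
    """P(X >= matches) for X ~ Binomial(n, p).

    A simplified stand-in for a significance estimate: it treats every
    aligned position as an independent trial with the same chance p of
    matching, which real protein alignments do not satisfy.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(matches, n + 1)
    )


if __name__ == "__main__":
    # Toy alignment (hypothetical, not taken from SCOP, SMART, or Pfam).
    a = "ACD-EFG"
    b = "ACE-EF-"
    m, n = aligned_identity(a, b)
    # Assumed background match probability; a real analysis would derive it
    # from amino acid composition or an empirical distribution.
    p_value = binomial_tail(m, n, p=0.05)
    print(f"{m}/{n} identical, binomial tail = {p_value:.3g}")
```

The sketch only shows why identities are counted over structurally equivalent positions and how a count can be turned into a probability; the independence assumption behind the binomial model is exactly the kind of simplification a dedicated statistical treatment is meant to replace.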