Protein Science
Protein Science (2001), 10:1970-1979.
Copyright © 2001 The Protein Society

Comparing function and structure between entire proteomes

Jinfeng Liu1,2 and Burkhard Rost1

1 CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
2 Graduate Program in Pharmacology, Columbia University, New York, New York 10032, USA

Reprint requests to: Burkhard Rost, CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168 Street, BB217, New York, New York 10032, USA; e-mail: rost{at}columbia.edu; fax: (212) 305-7932.

More than 30 organisms have been sequenced entirely. Here, we applied a variety of simple bioinformatics tools to analyze 29 proteomes for representatives from all three kingdoms: eukaryotes, prokaryotes, and archaebacteria. We confirmed that eukaryotes have relatively more long proteins than prokaryotes and archaea, and that the overall amino acid composition is similar among the three. We predicted that ~15%–30% of all proteins contained transmembrane helices. We could not find a correlation between the content of membrane proteins and the complexity of the organism. In particular, we did not find significantly higher percentages of helical membrane proteins in eukaryotes than in prokaryotes or archaea. However, we found more proteins with seven transmembrane helices in eukaryotes and more with six and 12 transmembrane helices in prokaryotes. We found twice as many coiled-coil proteins in eukaryotes (10%) as in prokaryotes and archaea (4%–5%), and we predicted ~15%–25% of all proteins to be secreted by most eukaryotes and prokaryotes. Every tenth protein had no known homolog in current databases, and 30%–40% of the proteins fell into structural families with >100 members. A classification by cellular function verified that eukaryotes have a higher proportion of proteins for communication with the environment. Finally, we found at least one homolog of experimentally known structure for ~20%–45% of all proteins; the regions with structural homology covered 20%–30% of all residues. These numbers may or may not suggest that there are 1200–2600 folds in the universe of protein structures. All predictions are available at http://cubic.bioc.columbia.edu/genomes.

Keywords: Protein sequence analysis; analyzing entire genomes; helical membrane proteins; coiled-coil proteins; signal peptides; comparative modeling

Abbreviations: 3D structure, three-dimensional structure (i.e., coordinates of all residues/atoms in a protein) • COILS, prediction of coiled-coil regions from sequence based on statistics and expert rules • ORF, open reading frame (protein predicted by genome-sequencing project) • PDB, protein data bank of protein structures • PHDhtm, profile-based neural network prediction of transmembrane helices • PSI-BLAST, fast and reliable database search method • SignalP, neural network based prediction of signal peptides • SWISS-PROT, curated database with protein sequences and functional annotations • TM, transmembrane helices • TrEMBL, automatic translation of EMBL nucleotide database of protein sequences
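
The abstract's comparison of protein lengths and amino acid composition across proteomes can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names parse_fasta and proteome_summary, the toy sequences, and the long_cutoff threshold are all hypothetical choices made for illustration. The sketch assumes a proteome is given as FASTA-formatted text and reports the fraction of proteins longer than the cutoff together with the overall residue frequencies.

```python
from collections import Counter


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def proteome_summary(sequences, long_cutoff=500):
    """Return (fraction of long proteins, residue composition).

    long_cutoff is a hypothetical threshold picked for illustration;
    the source does not state which length defines a "long" protein.
    """
    sequences = list(sequences)
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    total = sum(counts.values())
    # Composition is pooled over all residues of the proteome.
    composition = {aa: n / total for aa, n in counts.items()} if total else {}
    n_long = sum(1 for seq in sequences if len(seq) > long_cutoff)
    frac_long = n_long / len(sequences) if sequences else 0.0
    return frac_long, composition


# Toy input (made-up sequences, not real proteome data).
toy = ">p1\nMKTAYIAK\n>p2\nMK\n"
seqs = [s for _, s in parse_fasta(toy)]
frac_long, comp = proteome_summary(seqs, long_cutoff=5)
```

Running the same summary on each proteome and comparing the resulting fractions and frequency tables is one simple way to reproduce the kind of comparison described above; predictions such as transmembrane helices or signal peptides would need the dedicated tools listed in the abbreviations.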





