Protein Science (2004), 13:1787-1801.
Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society
Analysis of the "thermodynamic information content" of a Homo sapiens structural database reveals hierarchical thermodynamic organization
Scott A. Larson and Vincent J. Hilser
Department of Human Biological Chemistry and Genetics, University of Texas Medical Branch, Galveston, Texas 77555-1068, USA
(RECEIVED February 23, 2004; FINAL REVISION April 23, 2004; ACCEPTED April 23, 2004)
Abstract
Classification of the amounts and types of lower order structural elements in proteins is a prerequisite to effective comparisons between protein folds. In an effort to provide an additional vehicle for fold comparison, we present an alternative classification scheme whereby protein folds are represented in statistical thermodynamic terms in such a way as to illuminate the energetic building blocks within protein structures. The thermodynamic relationship between amino acid sequences and the conformational ensembles is examined for a database of 159 Homo sapiens protein structures ranging from 50 to 250 amino acids. Using hierarchical clustering, it is shown through fold-recognition experiments that (1) eight thermodynamic environmental descriptors are sufficient to account for the energetic variation within the native state ensembles of the H. sapiens structural database, (2) an amino acid library of only six residue types is sufficient to encode >90% of the thermodynamic information required for fold specificity in the entire database, and (3) structural resolution of the statistically derived environments reveals sequential cooperative segments throughout the protein, which are independent of secondary structure. As the first level of thermodynamic organization in proteins, these segments represent the thermodynamic counterpart to secondary structure.
Keywords: native state ensemble; sequential cooperative segments; fold recognition; protein structure prediction; position-specific thermodynamics; protein stability
Abbreviations: PDB, Protein Data Bank; PAM, Partitioning Around Medoids; ASA, accessible surface area; SCOP, structural classification of proteins; FSSP, families of structurally similar proteins.
Reprint requests to: Vincent J. Hilser, Department of Human Biological Chemistry and Genetics, 5.162 Medical Research Bldg., University of Texas Medical Branch, Galveston, TX 77555-1068, USA; e-mail: vince{at}hbcg.utmb.edu; fax: (409) 747-6816.
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.04706204.
