T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
(RECEIVED April 24, 2006; FINAL REVISION April 24, 2006; ACCEPTED May 18, 2006)
| Abstract |
|---|
[…] (α-helix, β-strand, β-turn, polyproline II, coil) were used to constrain polypeptide backbone chains devoid of side chains, and the most stable folded conformations were determined using Monte Carlo simulation. Just three terms were used to assess stability: molecular compaction, steric exclusion, and hydrogen bonding. For nine of the 13 proteins, this protocol restricts the main chain to a surprisingly small number of energetically favorable topologies, with the native one prominent among them.
Keywords: protein topology; protein folding; secondary structure; hydrogen bonding; confinement
| Introduction |
|---|
The proposition that secondary structure determines tertiary structure often provokes an argumentative response: some think it true but trivial; others are sure it is demonstrably false. The true-but-trivial group believes that little conformational latitude remains once α-helices, β-strands, and β-turns are fixed. Accordingly, it is important to realize that rebuilding a protein in three dimensions is actually a daunting task, even when starting from exact knowledge of backbone torsion angles (Holmes and Tsai 2004; Gong et al. 2005). Our level of prior structural knowledge is less restrictive still. Backbone torsion angles are not assumed, only the four broad secondary structure categories (α-helix, β-strand, β-turn, and polyproline II [PII]), with all other residues treated as coil. The nontrivial nature of our approach is assessed via several controls, each of which omits one or more crucial aspects of the method. In contrast to the true-but-trivial group, the demonstrably-false group asserts that a given set of secondary structure elements can be interconnected in multiple ways, so we hasten to emphasize that peptide chain turns (β-turns) and PII were included in our secondary structure definitions.
Our approach is computational: using Monte Carlo simulation, an extended polypeptide backbone chain was allowed to fold spontaneously under the influence of three simple energy terms: molecular compaction, steric exclusion, and hydrogen bonding. The first two terms are generic. Under folding conditions, water exerts a solvent-squeezing force on protein molecules (Liu and Bolen 1995), favoring compaction but limited by steric exclusion (Ramachandran et al. 1963). Only the third term, hydrogen bonding between backbone donors and acceptors (Fleming and Rose 2005), differs from one protein to another. Prior to simulation, secondary structure assignments (α-helix, β-strand, β-turn, and PII) were extracted from 13 proteins of known structure. During the simulation, these assignments were used to constrain the main chain to corresponding broad regions of conformational space. All residues not included in these four secondary structure categories were classified as coil and allowed to sample all sterically allowed regions.
Anfinsen's thermodynamic hypothesis, that the native conformation is determined by the amino acid sequence (Anfinsen 1973), does not imply a folding mechanism, deliberately so. Nevertheless, the hypothesis has often been interpreted to mean that tertiary structure arises as a consequence of detailed side chain interactions in the folded state. Instead, the results presented here suggest a hierarchic folding mechanism (Rose 1979; Baldwin and Rose 1999a,b) in which secondary structure biases are established locally, leading iteratively to further, mutually stabilizing interactions and resulting ultimately in native topology. However, this process can be obscured by the overall coil-globule collapse of the protein (Sosnick et al. 1996) under suitable experimental conditions.
| Materials and methods |
|---|
The backbone torsion angles, φ and ψ, were allowed to sample values randomly within a range imposed by each residue's secondary structure assignment (see below). The peptide bond torsion angle, ω, was varied randomly within the range 180° ± 5°. For each trial of φ, ψ, and ω angles, a residue was chosen at random and moved individually, except in the case of β-turns (see below), where backbone torsion angles for the two central residues (i + 1, i + 2) (Rose et al. 1985) were varied concomitantly.
Within this protocol, hydrogen bonds are not "wired-in." Rather, they are allowed to form or break spontaneously between any backbone donor and acceptor, subject to the Metropolis criterion. The hydrogen bond scoring function employed both distance and orientation criteria previously described (Fleming et al. 2005), with a maximum favorable score at a heavy atom donor-acceptor distance of 3.5 Å, decaying linearly to zero at a distance of 5 Å. A confinement score was devised to disfavor conformations with a geometric radius of gyration (Rg) larger than that predicted for a globular protein with the same number of residues (Rg-glob), where Rg-glob = 2.5 × N^0.34. The prefactor was decreased slightly from that previously reported (Gong et al. 2005) to favor proteins with smaller than average Rg.
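The distance dependence of the hydrogen bond score and the Rg-glob threshold can be written down in a few lines. The sketch below makes several assumptions that are not in the text: the score is normalized to 1.0, distances shorter than 3.5 Å are treated as fully favorable, the orientation criteria are omitted entirely, and `confinement_penalty` uses a simple linear penalty in the excess Rg, which is a stand-in for the actual confinement score.

```python
def hbond_distance_score(d, d_full=3.5, d_zero=5.0):
    """Distance part of a hydrogen bond score (orientation term omitted).

    Returns 1.0 at or below d_full (ASSUMED plateau), decays linearly,
    and reaches 0.0 at d_zero. Distances are in angstroms.
    """
    if d <= d_full:
        return 1.0
    if d >= d_zero:
        return 0.0
    return (d_zero - d) / (d_zero - d_full)


def rg_glob(n_residues, prefactor=2.5, exponent=0.34):
    """Rg predicted for a globular protein: 2.5 * N**0.34, as in the text."""
    return prefactor * n_residues ** exponent


def confinement_penalty(rg, n_residues):
    """ASSUMED functional form: zero when Rg <= Rg-glob, linear above it."""
    return max(0.0, rg - rg_glob(n_residues))
```

Whatever the exact penalty shape, the design intent described in the text is the same: conformations more expanded than a globule of the same length are disfavored, and compact ones are not penalized.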
Secondary structure assignments for each protein were based solely on backbone torsion angles as previously described (Srinivasan and Rose 1999). Five standard categories were defined: extended (E), helix (H), turn (T), polyproline II (P), and coil (C), with turns subdivided into the six specific β-turn types: I, I', II, II', III, and III' (Rose et al. 1985). Every protein residue was assigned to one of these broad categories, and during Monte Carlo simulation, that residue's trial backbone torsion angles were chosen at random from the corresponding region in φ,ψ-space. Residue-specific coil constraints were taken from observed distributions in the protein coil library (http://roselab.jhu.edu/coil/) (Fitzkee et al. 2005b). φ,ψ-Maps of all backbone torsion angle categories can be found at http://roselab.jhu.edu/movesets/ and in the Supplemental Material.
An initial simulation of 10,000 cycles (a cycle is N 2 Monte Carlo moves, where N = the number of residues) was performed with only steric exclusion and a hydrogen bond score as the Metropolis criteria. The best scoring structure from this initial simulation was further simulated for an additional 50,000 cycles, with the confinement score added to the Metropolis criteria. Control experiments were performed using steric exclusion alone or steric exclusion plus either confinement or hydrogen bonding, but not both; all simulations were constrained by their secondary structure assignments.
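The staged protocol can be illustrated with a generic Metropolis loop. This is a sketch under stated assumptions, not the simulation code used in the study: the `temperature` parameter, the lower-is-better score convention, and the `score_fn` and `propose_fn` callables are all hypothetical, and the number of moves per cycle is left as a parameter.

```python
import math
import random


def metropolis_accept(old_score, new_score, temperature, rng):
    """Metropolis criterion with lower scores treated as more favorable."""
    delta = new_score - old_score
    if delta <= 0.0:
        return True  # downhill or neutral moves are always accepted
    return rng.random() < math.exp(-delta / temperature)


def run_stage(state, score_fn, propose_fn, n_cycles, moves_per_cycle,
              temperature, rng):
    """Run one simulation stage and return (best_state, best_score)."""
    current, current_score = state, score_fn(state)
    best, best_score = current, current_score
    for _ in range(n_cycles * moves_per_cycle):
        candidate = propose_fn(current, rng)
        candidate_score = score_fn(candidate)
        if metropolis_accept(current_score, candidate_score, temperature, rng):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score
```

In this framing, the first stage would call `run_stage` with a score built from steric exclusion and hydrogen bonding only, and the second stage would restart from the returned best state with the confinement term added to `score_fn`. A sterically forbidden conformation can be represented by an infinite score, which the acceptance test then rejects.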
For each protein, 400 independent simulations were performed starting from an extended chain, and the conformer with the best combined hydrogen bond and confinement score was saved from each simulation. These 400 saved conformers were then clustered by structural similarity, with structure characterized by a "structure-vector" that included both the backbone torsion angles and all Cα(i) to Cα(i + 6,...,N) distances (i.e., an abbreviated Cα-distance matrix). The Cα-distance matrix component of this structure-vector was normalized by a function of N to balance its contribution against that of the torsion angle component. Hierarchic centroid linkage clustering of these vectors was performed with a modified version of Pycluster (de Hoon et al. 2004) after correcting for the periodicity of angular data.
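To make the structure-vector idea concrete, the sketch below builds the abbreviated Cα-distance list and compares two conformers with angle wrap-around handled explicitly. It is an illustration only: the Euclidean combination in `structure_distance` and the `dist_weight` parameter are assumptions standing in for the unspecified normalization by a function of N, and no clustering library calls are shown.

```python
import math


def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def ca_distances(coords, min_sep=6):
    """C-alpha(i) to C-alpha(j) distances for every j >= i + min_sep."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n)
            for j in range(i + min_sep, n)]


def structure_distance(torsions_a, torsions_b, dists_a, dists_b, dist_weight):
    """ASSUMED metric: Euclidean distance over periodic angle differences
    plus weighted C-alpha distance differences."""
    ang = sum(angle_diff(x, y) ** 2 for x, y in zip(torsions_a, torsions_b))
    dst = sum((dist_weight * (x - y)) ** 2 for x, y in zip(dists_a, dists_b))
    return math.sqrt(ang + dst)
```

The periodicity correction matters because torsion angles of 179° and -179° differ by only 2°, whereas a naive subtraction would report 358° and push nearly identical conformers into different clusters.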
All root mean squared difference (RMSD) values refer to Cα backbone comparisons. Ribbon diagrams were made with PyMOL (DeLano 2003), and Ramachandran plots were made using Grace (http://plasma-gate.weizmann.ac.il/Grace/).
| Results and Discussion |
|---|
Control 1
Conformers satisfying secondary structure constraints and steric exclusion only are illustrated in Figure 1A. The ensemble resembles a random coil population (Fitzkee and Rose 2004), with a mean geometric radius of gyration, <Rg>, of 24.7 ± 3.7 Å, similar to the value expected (22.4 Å) for a random coil in good solvent (de Gennes 1979). Helical segments are recognizable, but in the absence of hydrogen bonding, no β-hairpins or β-sheets were formed, and no conformers with native-like topologies were obtained, consistent with results reported earlier (Alexandrescu 2004). This control is a vivid demonstration that secondary structure constraints alone are insufficient to capture the fold.
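For reference, a geometric radius of gyration of the kind quoted above can be computed as the root mean square distance of the atoms from their centroid, with every atom weighted equally. The sketch below assumes a plain list of (x, y, z) coordinates; which atoms enter the calculation is not specified here and is left to the caller.

```python
import math


def radius_of_gyration(coords):
    """Geometric (unweighted) radius of gyration of a set of 3D points."""
    n = len(coords)
    # Centroid: the unweighted mean position of all points.
    center = tuple(sum(p[k] for p in coords) / n for k in range(3))
    # Mean squared distance of the points from the centroid.
    msd = sum(math.dist(p, center) ** 2 for p in coords) / n
    return math.sqrt(msd)
```

Averaging this quantity over the saved conformers of an ensemble gives a mean value that can be compared with the random coil expectation.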
[…] α-helices and β-hairpins, but no β-sheet is formed; <Rg> resembles that of a random coil (21.6 ± 2.6 Å), and again the native-like topology is not observed (data not shown). The combination of confinement and hydrogen bonding gives dramatic results. As seen in Figure 1C, the largest cluster includes many conformers with native topologies (Table 1). It should be emphasized that this result was achieved in the absence of side chain interactions; only secondary structure constraints and the three scoring functions were included.
Results for 13 proteins
Protein G is not unique. Simulations of eight additional proteins resulted in conformational ensembles that included native topologies, as seen in Figure 2. Table 1 summarizes results for all 13 proteins studied here. In each case, only one or two major structural clusters were identified in the simulated conformational ensemble, with the native topology well represented in the larger cluster for at least nine proteins.
It is common practice to assess similarity between two structures by their RMSD, an imperfect measure that reduces a complex comparison involving hundreds of vectors to a single scalar number. We conform to this practice here by citing RMSD values, but with the following reservation. A small RMSD between two proteins indicates structural similarity, but the converse need not be true; two proteins can be topologically similar yet have a large RMSD between them.
For most proteins in Table 1, many conformers in the larger cluster have both native topology and RMSD <6 Å (columns 6–7). Among the 13 proteins studied here, 10 have a population with RMSD ≤6 Å, but three fall in a different category, with RMSD >6 Å for the most accurate conformer. These results suggest that secondary structure alone is sufficient to define topology for small, single-domain proteins under folding conditions, although larger molecules with more complicated topologies may require inclusion of longer-range interactions. These results are consistent with our own previous fragment-assembly simulations (Gong et al. 2005) and with the fragment-assembly method of Chikenji et al. (2006) that included both local and long-range side chain interactions.
Four of the 13 test proteins resulted in less accurate topologies (Fig. 3). Although the helices in the all-α proteins, 1vii and 1r69, do describe the native chain path correctly, their rotational orientations would not engender a hydrophobic core if side chains were added (data not shown). This is to be expected because an α-helix can sample multiple rotational angles while maintaining the same overall orientation, and consequently, long-range interactions are probably required for further improvement in these cases as well as in larger proteins with more complicated topologies.
Summary
Long-range interactions play an important role in determining the details of protein structure (Frieden 2003; Kihara 2005). But evidence is accumulating that local interactions are the primary determinant of secondary structure (Baldwin and Rose 1999a) and that secondary structure, in turn, delimits overall topology (Przytycka et al. 1999; Hoang et al. 2004; Gong et al. 2005). Even so, it is clear that approximate φ,ψ torsion angles alone are insufficient to determine tertiary folds (Alexandrescu 2004). Here, we use Monte Carlo simulations to show that correct secondary structure assignments (α-helix, β-strand, β-turn, polyproline II, and coil), together with global confinement and maximal hydrogen bonding, can capture the topology of a test set of small globular proteins in the absence of long-range interactions. Guided by these factors, the conformational search is narrowed significantly and the correct topology is sampled frequently, providing opportunities for additional stabilizing interactions, including those involving side chains. Very recently, Zhang et al. (2006) also concluded that a combination of compaction and hydrogen bonded elements of secondary structure limits the number of feasible folds.
| Footnotes |
|---|
Reprint requests to: George D. Rose, T.C. Jenkins Department of Biophysics, Johns Hopkins University, 3400 N. Charles Street, Baltimore, MD 21218; e-mail: grose{at}jhu.edu; fax: (410) 516-4118.
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.062305106.
| Acknowledgments |
|---|
| References |
|---|
Anfinsen C.B. 1973. Principles that govern the folding of protein chains. Science 181: 223–230.
Baldwin R.L. and Rose G.D. 1999a. Is protein folding hierarchic? I: Local structure and peptide folding. Trends Biochem. Sci. 24: 26–33.
Baldwin R.L. and Rose G.D. 1999b. Is protein folding hierarchic? II: Folding intermediates and transition states. Trends Biochem. Sci. 24: 77–83.
Bradley P., Misura M.S., Baker D. 2005. Toward high-resolution de novo structure prediction for small proteins. Science 309: 1868–1871.
Chandonia J.M., Hon G., Walker N.S., Lo Conte L., Koehl P., Levitt M., Brenner S.E. 2004. The ASTRAL compendium in 2004. Nucleic Acids Res. 32: D189–D192.
Chikenji G., Fujitsuka Y., Takada S. 2006. Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study. Proc. Natl. Acad. Sci. 103: 3141–3146.
de Gennes P.-G. 1979. Scaling concepts in polymer physics. Wiley, New York.
de Hoon M.J.L., Imoto S., Nolan J., Miyano S. 2004. Open source clustering software. Bioinformatics 20: 1453–1454.
DeLano W. 2003. The PyMOL molecular graphics system. DeLano Scientific LLC, San Carlos, CA.
Fitzkee N.C. and Rose G.D. 2004. Reassessing random-coil statistics in unfolded proteins. Proc. Natl. Acad. Sci. 101: 12497–12502.
Fitzkee N.C., Fleming P.J., Gong H., Panasik Jr. N., Street T.O., Rose G.D. 2005a. Are proteins made from a limited parts list? Trends Biochem. Sci. 30: 73–80.
Fitzkee N.C., Fleming P.J., Rose G.D. 2005b. The Protein Coil Library: A structural database of nonhelix, nonstrand fragments derived from the PDB. Proteins 58: 852–854.
Fleming P.J. and Rose G.D. 2005. Do all backbone polar groups in proteins form hydrogen bonds? Protein Sci. 14: 1911–1917.
Fleming P.J., Fitzkee N.C., Mezei M., Srinivasan R., Rose G.D. 2005. A novel method reveals that solvent water favors polyproline II over β-strand conformation in peptides and unfolded proteins: Conditional hydrophobic accessible surface area (CHASA). Protein Sci. 14: 111–118.
Frieden C. 2003. The kinetics of side chain stabilization during protein folding. Biochemistry 42: 12439–12446.
Gong H. and Rose G.D. 2005. Does secondary structure determine tertiary structure in proteins? Proteins 61: 338–343.
Gong H., Fleming P.J., Rose G.D. 2005. Building native protein conformation from highly approximate backbone torsion angles. Proc. Natl. Acad. Sci. 102: 16227–16232.
Hoang T.X., Trovato A., Seno F., Banavar J.R., Maritan A. 2004. Geometry and symmetry presculpt the free-energy landscape of proteins. Proc. Natl. Acad. Sci. 101: 7960–7964.
Holmes J.B. and Tsai J. 2004. Some fundamental aspects of building protein structures from fragment libraries. Protein Sci. 13: 1636–1650.
Kihara D. 2005. The effect of long-range interactions on the secondary structure formation of proteins. Protein Sci. 14: 1955–1963.
Liu Y. and Bolen D.W. 1995. The peptide backbone plays a dominant role in protein stabilization by naturally occurring osmolytes. Biochemistry 34: 12884–12891.
Metropolis N., Rosenbluth A.W., Rosenbluth M.N., Teller A.H., Teller E. 1953. Equation of state calculations by fast computing machines. J. Chem. Phys. 21: 1087–1092.
Przytycka T., Aurora R., Rose G.D. 1999. A protein taxonomy based on secondary structure. Nat. Struct. Biol. 6: 672–682.
Ramachandran G.N., Ramakrishnan C., Sasisekharan V. 1963. Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7: 95–99.
Richards F.M. 1977. Areas, volumes, packing, and protein structure. Annu. Rev. Biophys. Bioeng. 6: 151–176.
Rose G.D. 1979. Hierarchic organization of domains in globular proteins. J. Mol. Biol. 134: 447–470.
Rose G.D., Gierasch L., Smith J.A. 1985. Turns in peptides and proteins. In Advances in protein chemistry, pp. 1–109. Academic Press, New York.
Sosnick T.R., Mayne L., Englander S.W. 1996. Molecular collapse: The rate-limiting step in two-state cytochrome c folding. Proteins 24: 413–426.
Srinivasan R. and Rose G.D. 1999. A physical basis for protein secondary structure. Proc. Natl. Acad. Sci. 96: 14258–14263.
Zhang Y., Hubner I.A., Arakaki A.K., Shakhnovich E., Skolnick J. 2006. On the origin and highly likely completeness of single-domain protein structures. Proc. Natl. Acad. Sci. 103: 2605–2610.