Protein Science (2002), 11:2437-2455.
Copyright © 2002 The Protein Society
A simple model for polyproline II structure in unfolded states of alanine-based peptides
Rohit V. Pappu1 and
George D. Rose2
1 Department of Biomedical Engineering and Center for Computational Biology, Washington University in St. Louis, St. Louis, Missouri 63130, USA
2 The Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
Reprint requests to: Rohit V. Pappu, Department of Biomedical Engineering and Center for Computational Biology, Washington University in St. Louis, One Brookings Drive, Campus Box 1097, St. Louis, Missouri 63130, USA; e-mail: pappu{at}biomed.wustl.edu.
(RECEIVED May 30, 2002;
FINAL REVISION July 16, 2002;
ACCEPTED July 17, 2002)
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0217402.
Abstract
The striking similarity between observed circular dichroism spectra of nonprolyl homopolymers and those of regular left-handed polyproline II (PII) helices prompted Tiffany and Krimm to propose in 1968 that unordered peptides and unfolded proteins are built of PII segments linked by sharp bends. A large body of experimental data, accumulated over the past three decades, provides compelling evidence in support of the original hypothesis of Tiffany and Krimm. Of particular interest are the recent experiments of Shi et al., who find significant PII structure in a short unfolded alanine-based peptide. What is the physical basis for PII helices in peptide and protein unfolded states? The widely accepted view is that favorable chain-solvent hydrogen bonds lead to a preference for dynamical fluctuations about noncooperative PII helices in water. Is this preference simply a consequence of hydrogen bonding or is it a manifestation of a more general trend for unfolded states, which are appropriately viewed as chains in a good solvent? The prevalence of closely packed interiors in folded proteins suggests that under conditions that favor folding, water (which is a better solvent for itself than for any polypeptide chain) expels the chain from its midst, thereby maximizing chain packing. Implicit in this view is a complementary idea: under conditions that favor unfolding, chain-solvent interactions are preferred and in a so-called good solvent, chain packing density is minimized. In this work we show that minimization of chain packing density leads to preferred fluctuations for short polyalanyl chains around canonical, noncooperative PII-like conformations. Minimization of chain packing is modeled using a purely repulsive soft-core potential between polypeptide atoms. Details of chain-solvent interactions are ignored. Remarkably, the simple model captures the essential physics behind the preference of short unfolded alanine-based peptides for PII helices.
Our results are based on a detailed analysis of the potential energy landscape which determines the system's structural and thermodynamic preferences. We use the inherent structure formalism of Stillinger and Weber, according to which the energy landscape is partitioned into basins of attraction around local minima. We find that the landscape for the experimentally studied seven-residue alanine-based peptide is dominated by fluctuations about two noncooperative structures: the left-handed polyproline II helix and its symmetry mate.
Keywords: Configurational mapping; energy landscape; polyproline II helices; inherent structures; random-coil; packing density
Abbreviations: CD, circular dichroism; PII, left-handed polyproline II helix; NMR, nuclear magnetic resonance; NOE, nuclear Overhauser enhancement; sPII, symmetry mate of left-handed polyproline II helix; T, temperature; Ux, potential energy of conformation x; Z, configurational sum
Introduction
Folding reactions for two-state globular proteins are well described within the framework of equilibrium thermodynamics (Ginsburg and Carroll 1965; Anfinsen 1973). Under appropriate physiological conditions, the native state (N) emerges spontaneously and reversibly from the unfolded population (U) (Anfinsen 1973). A comprehensive understanding of the folding reaction requires knowledge of conformers populated in U and N under different conditions (Kauzmann 1959; Tanford 1968; Dill and Shortle 1991; Lattman and Rose 1993; Serrano 1995; Shortle 1996; O'Connell et al. 1999; Shortle and Ackerman 2001; van Günsteren et al. 2001; Zhou and Dill 2001; Klein-Seetharaman et al. 2002). In addition to their obvious importance in protein folding, unfolded states also appear to be physiologically important (Plaxco and Gross 1997; Dobson 1999; Wright and Dyson 1999; Dunker et al. 2001; Tsai et al. 2001; Uversky 2002).
Twenty-five years ago, Richards (1977) analyzed protein crystal structures and concluded that mean packing densities for the interior of folded proteins are approximately 0.75. The packing density, a dimensionless ratio, measures the degree to which compounds are either liquid-like or solid-like, and the observed mean value of 0.75 resembles that of close-packed spheres of uniform size. Although further refined in later work (Richards and Lim 1994), these early conclusions remain valid descriptions of protein native states. In the Richards paradigm, the central question in protein folding becomes: how do attractive intrachain interactions, subject to excluded volume restrictions and the constraints of chain connectivity, conspire to maximize the packing density?
Here, we explore the complementary idea that polypeptide chains minimize their packing density on unfolding. For two-state proteins (Jackson 1998) and peptides, we propose that upon unfolding, packing density is minimized, while upon folding, packing density is maximized, subject in both cases to constraints imposed by chain connectivity and excluded volume (Ramachandran and Sasisekharan 1968; Richards 1977).
The term "unfolded state" is used to describe the collection of conformers populated under extreme nonnative conditions, including high temperature, high pressure, extremes of pH, and high concentrations of denaturant (Shortle 1996). An unfolded polypeptide chain is equivalent to a chain in a "good solvent" (Flory 1953; Chan and Dill 1991), and its behavior can be modeled using a purely repulsive potential function (Flory 1953; deGennes 1976; Binder and Heerman 1997). In other words, under good-solvent conditions, chain-solvent interactions are maximized; as a consequence the packing density of the chain around itself is minimized. To capture this effect, we use an inverse power potential (Brant et al. 1967; Hoover et al. 1971; Stillinger and Weber 1985) that treats interatomic interactions as soft repulsions. Conformers favored by this potential are free of steric clashes, and pairwise interatomic distances are maximized with respect to hard-sphere contact distances. The chosen inverse power potential prefers conformers that minimize the covolume (Flory 1969) or shared volume between pairs of chain monomers, subject of course to the constraints imposed by chain connectivity. This behavior satisfies our requirement that chain packing density be minimized.
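The soft repulsion described above can be illustrated with a minimal sketch. It assumes a pairwise energy of the form ε(σij/rij)^n summed over atom pairs; the function name, the unit prefactor `epsilon`, the choice of σij as a sum of two radii, and the toy coordinates are all illustrative assumptions rather than the parameters given in Methods. A real calculation would also exclude covalently bonded pairs.

```python
import itertools
import math

def inverse_power_energy(coords, radii, n=12, epsilon=1.0):
    """Sum epsilon * (sigma_ij / r_ij)**n over all atom pairs.

    sigma_ij is taken as the sum of the two hard-sphere radii, which is
    an assumption made for this sketch only.
    """
    total = 0.0
    for (i, a), (j, b) in itertools.combinations(enumerate(coords), 2):
        r_ij = math.dist(a, b)           # interatomic distance
        sigma_ij = radii[i] + radii[j]   # hard-sphere contact distance
        total += epsilon * (sigma_ij / r_ij) ** n
    return total

# Toy example: three atoms on a line, extended versus compact spacing.
radii = [1.0, 1.0, 1.0]
extended = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0)]
compact = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

# The extended arrangement has the lower purely repulsive energy.
print(inverse_power_energy(extended, radii) < inverse_power_energy(compact, radii))
```

Because every term is positive and decays with distance, any arrangement that increases the ratios rij/σij lowers the total, which is the sense in which the potential favors low packing density.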
According to a widely accepted model, unfolded polypeptide chains can explore conformational space freely subject to the constraints imposed by local interactions between nearest neighbor residues (Smith et al. 1996). To a first approximation, local restrictions refer to steric effects captured in the well-known Ramachandran map for a dipeptide (Ramachandran et al. 1963; Ramachandran and Sasisekharan 1968). However, in recent work (Pappu et al. 2000), we showed that steric interactions between nonnearest neighbors along a backbone should lead unfolded peptides toward extended conformations in which each (φ,ψ)-pair preferentially populates the top left quadrant of a Ramachandran map. All extended conformers are free of steric clashes. However, as shown in this work, only a small fraction of allowed conformers satisfy the additional restriction that the packing density of chain around itself be minimized.
We have applied the idea of minimizing chain packing density to two systems studied in recent experiments, an alanine dipeptide or blocked alanine (Poon et al. 2000; Woutersen and Hamm 2000; Schweitzer-Stenner et al. 2001) and a seven-residue alanine peptide (Shi et al. 2002). In each system, we find that preferred conformers fluctuate about the (φ,ψ)-values of canonical left-handed polyproline II (PII) helices and their "symmetry mates" (sPII).
Longer polyalanine peptides "fold" to form stable α-helices in water (Scholtz et al. 1991). In comparison to the noncooperative PII-helices (Adzhubei et al. 1987) obtained by minimization of chain packing density, the α-helix is a compact structure that results from maximizing chain packing density due to the formation of cooperative attractive intrachain hydrogen bonds (Pauling et al. 1951) and van der Waals contacts.
Our results are in general agreement with recent experimental data (Poon et al. 2000; Woutersen and Hamm 2000; Schweitzer-Stenner et al. 2001; Shi et al. 2002). They are also consistent with work that suggests a dominant role for PII-helices in unfolded populations for a variety of systems (Woody 1992) including homopolymers (Holzwarth and Doty 1965; Tiffany and Krimm 1968), peptide fragments excised from glycoproteins (Matsumoto et al. 1983), oligomers that are based on residues other than alanine (Rucker and Creamer 2002), and proteins (Adzhubei et al. 1987; Smyth et al. 2001). Our results for the dipeptide agree with many of the general trends seen in numerous molecular dynamics simulations that include all the details of intrachain and chain-solvent interactions (Rossky et al. 1979; Anderson and Hermans 1988; Pettitt and Karplus 1988; Roterman et al. 1989; Schmidt and Fine 1994; O'Connell et al. 1999; Smith 1999; Tobias and Brooks 1999), although our method of analysis with a simple model uncovers PII as the lowest energy conformation for the dipeptide and the 7-mer. We also find good agreement with quantum mechanical calculations that include the effect of hydrogen bonded water molecules (Grant et al. 1990; Jalkanen and Suhai 1996; Han et al. 1998). It is clear that our good-solvent model, based on the proposal that peptides unfold by minimizing chain packing, unequivocally captures the tendency of short unfolded alanine-based peptides to fluctuate about noncooperative PII-like conformers, in agreement with the recent experiments of Kallenbach and coworkers (Shi et al. 2002).
Results
Conceptual framework
Our goal is to calculate the energies associated with each conceivable conformer in a polyalanine peptide N-Acetyl-Ala_ν-N′-Methyl-amide, where ν denotes the number of alanine residues. Energies are computed using a purely repulsive inverse power potential (Hoover et al. 1971; Stillinger and Weber 1985) that is akin to the repulsive term of a classical 6-12 Lennard-Jones potential function (see Methods). The only degrees of conformational freedom are the peptide-backbone (φ,ψ)-angles. Unique chain conformations are derived by specifying (φ,ψ)-values for each pair. The Boltzmann-weighted distribution of energies as a function of temperature, that is, distribution of relative free energies, is analyzed to determine whether fluctuations occur about preferred regions in conformational space.
It is impossible to visualize multidimensional energy landscapes for long chains, but an elegant approach introduced by Stillinger and Weber (1984, 1985) overcomes this obstacle. Conformational basins are identified by a mapping that connects all conformers to their corresponding local minima. Briefly, the Boltzmann-weighted sum over all conceivable conformers (Z) for a system in a canonical ensemble may be written as:
$$Z = \sum_{i} \exp\left(-\frac{U_i}{k_B T}\right)$$
$U_i$ is the potential energy for conformation i and $k_B$ is the Boltzmann constant. Z, referred to as a configurational sum, acts as an estimate of the canonical partition function (Chandler 1987). The energy $U_i$ of a single conformation can be rewritten as $U_i = \{U_{\alpha o} + (U_i - U_{\alpha o})\} = U_{\alpha o} + \Delta U_{\alpha i}$, where $U_{\alpha o}$ is the energy of the local minimum obtained by an energy minimization with i as the starting conformation. The sum over all conformations is rewritten as a sum over basins:
$$Z = \sum_{\alpha} \exp(-\beta U_{\alpha o}) \sum_{\{i\}} \exp(-\beta \Delta U_{\alpha i})$$
where {i} refers to the set of unique conformations that can be mapped to the local minimum $\alpha o$ and $\beta = 1/k_B T$. The likelihood $P_\alpha$ of finding the system in basin α is:
$$P_\alpha = \frac{\exp(-\beta U_{\alpha o}) \sum_{\{i\}} \exp(-\beta \Delta U_{\alpha i})}{Z}$$
In the descriptive terminology of Stillinger and Weber (1984), mapping by energy minimization is configurational mapping; local minima are inherent structures, and the set of conformations that map to an inherent structure constitutes a basin. Thermodynamic preferences are expressed as basin occupation probabilities, and structural preferences are identified by analyzing the conformations associated with inherent structures or local minima. Configurational mapping for a simple one-dimensional function is illustrated in Figure 1.
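The configurational mapping described above can be sketched for a one-dimensional toy function. The double-well `energy` function, the grid, and the value of `kT` are illustrative assumptions unrelated to the peptide potential, and the descent on a grid merely stands in for the energy minimization used in this work.

```python
import math
from collections import defaultdict

def energy(x):
    # Hypothetical tilted double well, used only for illustration.
    return (x * x - 1.0) ** 2 + 0.2 * x

def map_to_minimum(i, energies):
    """Walk downhill on the grid until neither neighbour is lower."""
    while True:
        best = i
        for j in (i - 1, i + 1):
            if 0 <= j < len(energies) and energies[j] < energies[best]:
                best = j
        if best == i:
            return i          # grid index of the inherent structure
        i = best

def basin_probabilities(xs, kT):
    """Group Boltzmann weights by the minimum each grid point maps to."""
    energies = [energy(x) for x in xs]
    weights = defaultdict(float)
    for i, u in enumerate(energies):
        weights[map_to_minimum(i, energies)] += math.exp(-u / kT)
    z = sum(weights.values())  # configurational sum over sampled points
    return {xs[m]: w / z for m, w in weights.items()}

xs = [-2.0 + 0.01 * k for k in range(401)]
probs = basin_probabilities(xs, kT=0.1)
for x_min, p in sorted(probs.items()):
    print(f"minimum near x = {x_min:+.2f}: P = {p:.3f}")
```

Each grid point contributes its Boltzmann weight to exactly one basin, so the basin probabilities sum to one by construction.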
The alanine dipeptide
Energetic preferences for the inverse-power potential function can be visualized from the energy landscape for the alanine dipeptide, which has just two degrees of freedom, φ and ψ. The energy landscape as a function of φ and ψ obtained from a coarse sampling is shown in Figure 2. To obtain a rigorous mapping of the energy landscape, an elaborate enumeration of conformations was carried out. A total of 4 × 10⁸ uniformly distributed random conformers were generated, and the inverse-power potential energy was calculated for each.
Configurational mapping was used to identify basins of attraction and quantify energetic preferences. A two-step energy minimization (see Methods) was performed for each of the 4 × 10⁸ distinct dipeptide conformations; results are summarized in Table 1. Ten unique minima were found on the dipeptide energy surface. The locations of these minima and their respective catchment regions are shown in Figure 3, where all points that map to a local minimum are the same color.

Fig. 3. The ten inherent structure basins from configurational mapping. Within each basin, the local minimum on the potential energy surface is marked with an "x". Same-color points map to the same local minimum. Points within the three highest energy basins are shown in white, and the boundaries for these basins are clear in Figure 2. Only 1/1000th of the sampled points were used to create the image; blank spots within a basin reflect this coarse-graining.
The global minimum for the dipeptide corresponds to the (φ,ψ)-values for a PII-helix
The global minimum is situated at (φ,ψ) = (-81.81°, 146.74°). If all (φ,ψ)-pairs for an oligomer assume similar values, the resultant conformation is a helix about some long axis in space (Ramachandran and Sasisekharan 1968). In this case the molecule would be a regular left-handed polyproline II (PII) helix characterized by three residues per turn. Ideal (φ,ψ) values in a canonical PII helix vary in the literature. Ramachandran and Sasisekharan (1968) showed that for standard peptide geometries, with ω = 180° and the N-Cα-C′ bond angle set at 110°, (φ,ψ) = (-77.2°, 145.9°) generates a left-handed three-residue per turn helix with a pitch of 9.12 Å. For the rigid peptide geometry chosen in this work, the (φ,ψ) values at the global minimum for the dipeptide generate a regular three-residue per turn PII helix with a pitch of 9.07 Å. The second lowest energy conformer for the dipeptide is at (φ,ψ) = (-147.42°, 80.94°), and the corresponding helix is a three-residue per turn "symmetry mate" of the PII helix referred to hereafter as sPII.
Ramachandran and Sasisekharan (1968) used the Cahn-Ingold-Prelog (Cahn et al. 1966) designation for the different conceivable helices. According to this convention, the sPII helix would be referred to as a P31L helix, where P is the Cahn et al. notation for right-handed axial chirality, 31 denotes a three-residue per turn helix, and L indicates that the (φ,ψ)-value is in the allowed region for L amino acids (that are not proline). According to this convention, the PII helix would be referred to as the M31L helix, where M is the Cahn et al. designation for left-handed chirality. For simplicity we use the notations PII and sPII. The two regular helices, PII and sPII, for a blocked alanine 7-mer are shown in Figure 4.
For purely repulsive inverse power potentials, dipeptide conformers that maximize their pairwise interatomic distances with respect to the hard-sphere contact distances are preferred. In detail, for all atom pairs i and j, as the ratio of interatomic distances to hard-sphere contact distances (rij/σij) increases (subject to covalent constraints), the dipeptide energy decreases. When dipeptide conformers are drawn from the allowed regions of a hard-sphere Ramachandran map (Ramachandran et al. 1963; Pappu et al. 2000), all interatomic distances satisfy the condition (rij/σij ≥ 1). At the global minimum (PII), there are no interatomic contacts with (rij/σij) < 1.1 and only two contacts with (rij/σij) < 1.25. In comparison, at the penultimate minimum (sPII), the side-chain methyl group is closer to the adjacent carbonyl oxygen atom, resulting in a small increase in energy with respect to the global minimum (Table 1).
Conformational basins
As shown in Figure 3, the basin catchment regions differ in size. The set of 4 × 10⁸ conformers was used to estimate the configurational sum (Z) at a specified temperature, T, which in turn was used to calculate temperature-dependent occupation probabilities for the ten basins. The temperatures used in this work are not directly related to physical temperatures measured in experiments because we made no attempt to parameterize the energies to give agreement with small molecule experimental data. Our temperature scale is therefore a nonphysical temperature scale. Basin occupation probabilities change with temperature, and basins that encompass a large number of low-energy conformers are preferentially populated (Fig. 5).
The fraction of the 4 × 10⁸ sampled points required to account for 99.9% of the configurational sum (Z) at two different "temperatures", T = 100K and T = 300K, is shown in Figure 6. Estimating the fraction of thermodynamically relevant conformers is a novel approach for assaying the results of conformational sampling (Pappu et al. 2000). Conformational energies are sorted in ascending order, and the number of conformers required to account for some stipulated fraction of the sampled configurational sum is calculated. This approach bears close resemblance to the method of most probable distributions in equilibrium statistical mechanics (Chandler 1987).
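The sorting procedure just described can be sketched as follows. The uniformly distributed toy energies, the two `kT` values, and the function name are assumptions made for illustration; they are not the sampled dipeptide energies or the temperature scale of this work.

```python
import math
import random

def fraction_for_coverage(energies, kT, coverage=0.999):
    """Smallest fraction of conformers, taken in ascending energy order,
    whose Boltzmann weights reach `coverage` of the sampled sum Z."""
    ordered = sorted(energies)
    u_min = ordered[0]
    # Shift by the lowest energy so the exponentials stay well scaled.
    weights = [math.exp(-(u - u_min) / kT) for u in ordered]
    target = coverage * sum(weights)
    running = 0.0
    for count, w in enumerate(weights, start=1):
        running += w
        if running >= target:
            return count / len(ordered)
    return 1.0

random.seed(0)
toy_energies = [random.uniform(0.0, 10.0) for _ in range(100_000)]  # made-up data
cold = fraction_for_coverage(toy_energies, kT=0.2)
warm = fraction_for_coverage(toy_energies, kT=0.6)
print(f"fraction needed at low kT:  {cold:.3f}")
print(f"fraction needed at high kT: {warm:.3f}")
```

In this toy case the required fraction grows with temperature, mirroring the widening envelopes of thermodynamically relevant conformers.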

Fig. 6. Temperature-dependent envelopes for alanine dipeptide basins from configurational mapping. The points required to account for 99.9% of the configurational sum calculated using all 4 × 10⁸ conformers are shown at two temperatures: (a) T = 300K and (b) T = 100K. In order of thermodynamic preference, the five basins are: blue (p), red (e), green (r), magenta (i), and cyan (l). Populations associated with each basin are shown.
In Figure 6, envelopes of points are color-coded by their basin memberships. These envelopes measure the contributions due to conformational entropy. As temperature increases, the number of thermodynamically relevant basins increases, as does the number of relevant conformers within each basin. Conversely, as temperature decreases, the population of thermodynamically relevant conformers approaches the immediate vicinity of the global minimum. At T = 300K, only five of the ten basins have significant occupation probabilities.
How do our results for the dipeptide compare with theory and experiment?
Poon et al. (2000) studied the conformations of blocked alanine (the alanine dipeptide) in an aqueous liquid crystal, which mimics the behavior of the dipeptide in water, a good solvent for oligopeptides. They interpreted measured residual dipolar couplings from a proton NMR spectrum in terms of one dominant conformation (φ = -85°, ψ = 160°). Although the ψ-angle is larger than canonical values, Poon et al. refer to the dominant conformer as the (φ,ψ)-value for a PII-helix. The inherent plasticity of PII-helices (Adzhubei et al. 1987) justifies their interpretation.
Woutersen and Hamm (2000) used polarization-sensitive two-dimensional vibrational spectroscopy to study the backbone conformations for trialanine in aqueous solution. Trialanine is similar to the alanine dipeptide in that it has a similar number of degrees of freedom if one assumes that the terminal φ- and ψ-angles are in a trans conformation. They conclude that the dominant conformer for trialanine has (φ,ψ)-values of (-60°, 140°).
The experimental work of Poon et al. was prompted by earlier theoretical analysis (Grant et al. 1990; Han et al. 1998) for the alanine dipeptide. Starting from eight gas-phase energy minima found by Jalkanen and Suhai (1996), Han et al. (1998) studied the effect of hydration using a density functional approach. They found the dipeptide energies with four coordinated water molecules to be quite different from the gas-phase energies. The global minimum with hydration is in the vicinity of the PII conformation. The (φ,ψ)-values and relative density functional energies for the seven low energy structures found by Han et al. (1998) are shown in Table 2. Also shown are energies calculated using the inverse power potential for these seven structures (column 4, Table 2). The rank ordering of conformations by energy is almost identical, although quantitative agreement of relative energies is poor, as would be expected for a model that ignores the details of peptide-solvent interactions.
What is the source of the good qualitative agreement between our results and those of Han et al.?
This is an especially important question because the results of Han et al. (1998) were obtained using four water molecules coordinated around the two peptide groups of the alanine dipeptide, whereas we ignored all details of intrachain and dipeptide-solvent interactions. The process of solvation may be split into two steps (Pieroti 1965): (1) the creation of a cavity of appropriate size and shape to accommodate the solute, and (2) introduction of the solute into the cavity to interact favorably with solvent. Our calculation addresses step (1) of the solvation process. It must be true that PII-like conformations possess the appropriate distribution of intramolecular voids to accommodate solvent molecules. If so, the success of our model lies in the general validity of the good-solvent inverse-power potential which strives to minimize intrachain packing density and promote the creation of intra-molecular voids.
Comparison with results from other simulations
Free-energy surfaces for the alanine dipeptide have been the subject of numerous independent investigations based on all-atom force fields, both in vacuo and in the presence of solvent (Rossky et al. 1979; Anderson and Hermans 1988; Pettitt and Karplus 1988; Roterman et al. 1989; Schmidt and Fine 1994; Apostolakis et al. 1999; O'Connell et al. 1999; Smith 1999; Tobias and Brooks 1999). With the exception of Caflisch and coworkers (Apostolakis et al. 1999), who find the global free-energy minimum to be at (φ,ψ) ≈ (-75°, 136°) for the dipeptide in water, almost all simulations converge on the global minimum being either in the β-region (Anderson and Hermans 1988; O'Connell et al. 1999) or in the vicinity of the right-handed α-helix, (φ,ψ) ≈ (-72°, -56°) (Smith 1999). In the work of Tobias and Brooks (1992), the global minimum is in the vicinity of PII, (φ,ψ) ≈ (-80°, 120°). Smith (1999) finds closer agreement with the results of Apostolakis et al. (1999) and those of Tobias and Brooks (1992) using either a continuum dielectric or a Poisson-Boltzmann approach to embed the alanine dipeptide solute in a solvent continuum. In summary, many of the force-field approaches capture the general trend toward PII, although none are unequivocal about this preference, and numerous disagreements exist between the different force-field calculations. Smith's work (Smith 1999) provides an excellent summary of results from different calculations.
Mu and Stock (2002) performed two 20-ns simulations at T = 300K and T = 350K for trialanine in water. This is the molecule studied by Woutersen and Hamm (2000) using vibrational spectroscopy. Mu and Stock found that the trialanine molecule spends roughly 80% of the time in the vicinity of β-strand (φ ≈ -122° and ψ ≈ 130°) and PII (φ ≈ -67° and ψ ≈ 132°) conformers.
Calculating the free-energy surface based on the inherent-structure formalism is relatively straightforward. The free-energy difference between a basin surrounding minimum α and the global minimum PII basin at temperature T is:
$$\Delta A_{\alpha o} = -RT\{\ln(P_\alpha) - \ln(P_o)\}$$
where $P_\alpha \equiv P_\alpha(T)$ is the occupation probability of basin α, $P_o \equiv P_o(T)$ is the occupation probability of the global minimum basin, and $\Delta A_{\alpha o}$ is the difference in Helmholtz free energy. The basin occupation probabilities for the five low-energy basins are plotted as a function of temperature in Figure 7a. The results in this figure were used to calculate the Helmholtz free energies of the different basins relative to the PII basin, as shown in Figure 7b. At low temperatures, the dipeptide prefers to fluctuate in the immediate vicinity of the PII conformation. From Figure 7b it is clear that the dipeptide freely fluctuates around the PII and sPII minima, and the two basins are essentially isoenergetic for all temperatures greater than 100K. General agreement with detailed free energy calculations is encouraging, and we believe our model provides a simple interpretation of results seen using detailed all-atom force fields.
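As a small worked example of the relation between basin occupation probabilities and relative Helmholtz free energies, the sketch below evaluates -RT{ln(P) - ln(Po)} for made-up probabilities. The function name and the numerical probabilities are hypothetical, and because the temperature scale in this work is nonphysical, the use of the gas constant in kcal/(mol K) is only an illustrative choice of units.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def relative_free_energy(p_basin, p_global, temperature):
    """Helmholtz free-energy difference -RT * (ln P_basin - ln P_global)."""
    return -R_KCAL * temperature * (math.log(p_basin) - math.log(p_global))

# Hypothetical occupation probabilities, chosen only for illustration.
less_populated = relative_free_energy(0.30, 0.40, 300.0)
equally_populated = relative_free_energy(0.40, 0.40, 300.0)
print(less_populated)      # positive: the basin lies above the reference basin
print(equally_populated)   # vanishes when the two probabilities are equal
```

A basin that is less populated than the reference basin has a positive relative free energy, and two equally populated basins are isoenergetic in the free-energy sense.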
There are only two adjustable parameters in our model: the exponent for inverse power potentials (n) and values for the hard sphere radii (σ; see Methods). The latter are dictated by stereochemical considerations (Ramachandran and Sasisekharan 1968). Consequently n is the only adjustable parameter. As the exponent n in (σij/rij)^n is lowered, inverse-power potentials mimic interactions between softer spheres. Increase of n beyond 12 results in interactions between harder spheres. Table 3 summarizes the location of local minima, the number of local minima and the energies relative to the global minimum for n = 9 and 14. As n decreases, a pair of minima connected by a saddle point shift toward each other, and the relative energies and the energy at the barrier decrease until the barrier is eliminated and the basins coalesce into a single minimum. The relative energy differences between existing minima increase. As n increases beyond 12, all allowed conformers become isoenergetic and their total inverse-power potential energy approaches zero. Conversely, the high-energy conformers also become isoenergetic and the inverse-power potential energy approaches infinity, consistent with the fact that for a hard sphere potential, conformers with steric overlap are disallowed.
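The limiting behavior with increasing n can be checked numerically for a single pair term. The sketch below assumes the pair term (σij/rij)^n with a unit prefactor; the two distance ratios and the value n = 50 are arbitrary illustrative choices and do not come from Table 3.

```python
def pair_energy(ratio, n):
    """Inverse-power pair term (sigma_ij / r_ij)**n, with ratio = r_ij / sigma_ij."""
    return (1.0 / ratio) ** n

for n in (9, 12, 14, 50):
    allowed = pair_energy(1.25, n)  # separation beyond hard-sphere contact
    overlap = pair_energy(0.90, n)  # separation inside hard-sphere contact
    print(f"n = {n:2d}: r/sigma = 1.25 -> {allowed:.2e}, r/sigma = 0.90 -> {overlap:.2e}")
```

As n grows, the term for a pair beyond contact shrinks toward zero while the term for an overlapping pair grows without bound, which is the hard-sphere limit described in the text.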
Table 3. Location and energies of alanine-dipeptide local minima for different values of the inverse-power potential exponent (n). These values are to be compared to those shown in Table 1 for n = 12.
The inverse power potentials used here are similar to the modified hindrance potentials of Flory and coworkers (see eqs. 4 and 7 in Brant et al. 1967). The low-energy contours and the location of the global minimum shown in Figure 2 and those from the hindrance potentials of Brant et al. (1967) are essentially identical. The asymmetric distribution of distances in Cartesian space leads to two local minima and a well-defined saddle point in the upper left quadrant for the inverse power potential.
Results for longer chains
A polyalanine 7-mer was studied at two temperatures, 100K and 300K. The 7-mer was chosen to facilitate direct comparison with the recent experimental work of Shi et al. (2002). The 7-mer has 14 independently rotatable bonds, so it is impossible to map the multidimensional energy landscape exhaustively. However, as noted earlier, only five of the ten conformational basins are required to account for 99.9% of the configurational sum for the alanine dipeptide over a wide range of temperatures. To make the enumeration of 7-mer conformers tractable, sampling was restricted to just the five basins shown in Figure 6a, that is, 5⁷ distinct pockets in the 14-dimensional space of interest.
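The count of pockets follows from assigning one of the five dipeptide basins to each of the seven residues, giving five to the seventh power combinations. A minimal sketch of that enumeration is shown below; the one-letter labels follow the basin names p, e, r, i, and l used in Figure 6, while the variable names are illustrative.

```python
import itertools

BASINS = "peril"   # one-letter labels for the five dipeptide basins
N_RESIDUES = 7

# Every way of assigning one basin label to each residue of the 7-mer.
combos = list(itertools.product(BASINS, repeat=N_RESIDUES))
print(len(combos))          # 5**7 = 78125
print("".join(combos[0]))   # "ppppppp", the all-p assignment
```

Restricting each residue to five basins therefore replaces an exhaustive 14-dimensional scan with a stratified search over 78,125 pockets.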
The restrictions imposed on the sampling are robust and do not lead to a neglect of thermodynamically significant conformers. This is a direct consequence of choosing purely repulsive potentials: energies of longer chains are either a strict sum of dipeptide energies, or increase drastically if the combination of (φ,ψ)-values leads to longer-range steric overlaps (Pappu et al. 2000). The latter is true of conformers that are a combination of (φ,ψ)-values from all but the two dipeptide basins shown in blue and in red in Figure 6. A stratified Monte Carlo importance sampling (see Methods) was used to generate a total of 5 × 10⁹ independent conformers for the two temperatures.
Configurational mapping allows us to analyze conformational preferences on a 14-dimensional energy landscape. For each of the 5 × 10⁹ sampled conformers, the two-step energy minimization described for the dipeptide was used to identify its inherent structure (local minimum). The set of conformers that map to a given inherent structure constitutes its basin. Once all inherent structures and basins were identified, the likelihood ($P_\alpha$) of finding the system within a basin (α) was calculated.
Results from configurational mapping
We identified 16,317 unique inherent structures at 100K and 78,103 unique inherent structures at 300K. The distribution of inherent structure energies at 300K, a superset of the 100K case, is shown in Figure 8. There are only four significant dipeptide basins at T = 100K, which explains the smaller number of basins for the 7-mer at this temperature. The global minimum is a regular PII helix, with φ,ψ-angles: {φ1,ψ1 = -81.8°, 146.4°; φi,ψi = -82.0°, 146.0°; φ7,ψ7 = -82.0°, 146.3°}, i = 2, ..., 6.
As noted earlier, the five relevant alanine dipeptide minima are p (φ ≈ -81°, ψ ≈ 147°), e (φ ≈ -147°, ψ ≈ 81°), r (φ ≈ -79°, ψ ≈ -49°), i (φ ≈ -149°, ψ ≈ -58°), and l (φ ≈ 53°, ψ ≈ 63°). On the 14-dimensional landscape, the inverse-power potential energy is the sum of dipeptide energies when (φ,ψ)-pairs sample values from the p or e dipeptide basins (Fig. 6), or when no two consecutive (φ,ψ)-pairs are from high-energy dipeptide basins, r, i, and l (Fig. 6). Such conformers are isoenergetic, and consequently the energies of several inherent structures ($U_\alpha$) and associated basin weights ($P_\alpha$) are identical. The free-energy difference between a basin α and the global minimum PII basin at temperature T is:
$$\Delta A_{\alpha o} = -RT\{\ln(P_\alpha) - \ln(P_o)\}$$
where $P_\alpha \equiv P_\alpha(T)$ is the occupation probability of basin α, $P_o \equiv P_o(T)$ is the occupation probability of the global minimum basin, and $\Delta A_{\alpha o}$ is the difference in Helmholtz free energy.
As an example, consider two inherent structures, α and γ, with φ,ψ-angles: (φi,ψi = -82.0°, 146.0°, i = 1, ..., 6; φ7,ψ7 = -147.5°, 81.6°) and (φi,ψi = -82.0°, 146.0°, i = 1,2,3,5,6,7 and φ4,ψ4 = -147.5°, 81.6°), respectively. The two inherent structures α and γ are isoenergetic, with $U_{\alpha o} = U_{\gamma o}$ = 6.1 kcal/mole. Additionally, their basin occupation probabilities, $P_\alpha$ and $P_\gamma$, are also similar, which leads to similar values for $\Delta A_{\alpha o}$ and $\Delta A_{\gamma o}$. The set of inherent structures realized by permutations of six (φ,ψ)-pairs ∈ p and one (φ,ψ)-pair ∈ e is labeled p6e1, and the label identifies the corresponding free-energy level.
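The number of distinct orderings within a free-energy level can be counted with a multinomial coefficient. The sketch below assumes that every ordering of the basin labels along the chain yields a distinct inherent structure, so it gives an upper bound rather than the degeneracies reported in Table 4; the function name is illustrative.

```python
from math import factorial

def level_degeneracy(n_p, n_e, n_r=0):
    """Number of distinct orderings of p, e, and r labels along the chain."""
    total = n_p + n_e + n_r
    return factorial(total) // (factorial(n_p) * factorial(n_e) * factorial(n_r))

print(level_degeneracy(7, 0))     # p7     -> 1
print(level_degeneracy(6, 1))     # p6e1   -> 7
print(level_degeneracy(5, 2))     # p5e2   -> 21
print(level_degeneracy(5, 1, 1))  # p5e1r1 -> 42
```

For the level p6e1, the seven orderings correspond to placing the single e residue at each of the seven positions, two of which are the example structures given above.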
T = 100K
Data for T = 100K are summarized in Table 4, which lists the inherent structure basins that account for 90% of the sampled population. Free-energy levels are labeled by the number of occurrences of p, e, or r φ,ψ-values in the inherent structure. Column 2 lists the number of degenerate basins at each free-energy level. All basins at a given free-energy level have an equal likelihood of being populated; column 4 gives the population at each of these levels.
There is a dominant cluster of basins, within $\Delta A_{\alpha o}/RT \le 1.8$ (Fig. 9a). This distribution is bracketed between p7 (PII) and e7 (sPII). All inherent structures between p7 and e7 are permutations of (φ,ψ)-values from these two types of helices. Beyond the dominant cluster ($\Delta A_{\alpha o}/RT > 1.8$ in Fig. 9a), the population is defined by conformers with one (φ,ψ)-value in the r-basin (centered around φ ≈ -81°, ψ ≈ -49°), primarily p6e0r1 and p5e1r1.
The system is found in the p7 basin in only ≈1.5% of the ensemble, but thermal fluctuations to higher free-energy levels access similar structures. To illustrate the structural nature of thermal fluctuations, we use the root-mean-squared-distance (RMSD) between superimposed inherent structures. Data for the 11 lowest free-energy levels are shown in Table 4, columns 5-7. Inherent structures from the same free-energy level are structurally similar to each other (column 5), but structurally dissimilar both to inherent structures from other levels and to random conformations. For example, the average RMSD of all unique pairwise combinations of the 21 inherent structures at level 3 (p5e2) is 0.22 ± 0.006 Å. In contrast, the average RMSD is 4.7 ± 1.9 Å between any of these inherent structures and 100 conformers generated from random combinations of (φ,ψ)-values from p, e, r, i, and l dipeptide basins. The RMSD was also computed between each inherent structure and the global minimum PII conformation (Table 4, column 7). Fluctuations about inherent structures that are structurally similar to the PII helix are lower in free energy and account for a major fraction (≈90%) of the equilibrium population. The data from Table 4 (column 7), plotted in Figure 10, reveal this strong correlation between free energy relative to the PII basin and structural similarity to a PII helix. Clearly the landscape is structured, and is dominated by uncorrelated fluctuations of individual residues around PII-helix (φ,ψ)-values.
T = 300K
An equivalent analysis was performed at a higher temperature, T = 300K. The results are shown in Figure 9c,d. Comparison of the scales on the abscissa of Figure 9a and c indicates that the free-energy difference between the global minimum and higher-energy basins decreases with increasing temperature. There is a roughly threefold increase in the number of relevant basins, which is consistent with a temperature-dependent increase in energy fluctuations. Two new clusters of inherent-structure basins emerge (Fig. 9c). In the second cluster ($1.9 \le \Delta A_{\alpha o} \le 2.8$), one (φ,ψ)-value is in the r-basin (φ ≈ -81°, ψ ≈ -49°), whereas in the third cluster, one (φ,ψ)-value is in the i-basin (φ ≈ -149°, ψ ≈ -58°); remaining (φ,ψ)-values are from either p or e. Overall, the diminished preference for low-energy PII (p7) basins is offset by the entropic preference for higher-energy basins.
The degeneracy for each free-energy level is shown in Figure 9b,d. The basins on each free-energy level are isoenergetic, and this promotes uncorrelated fluctuations of individual residues about preferred (φ,ψ)-values. The degeneracy (Fig. 9b,d) emphasizes that conformational entropy is maximized upon unfolding. Dynamic disorder refers to hopping between inherent-structure basins. Equilibrium populations for the 7-mer show clear evidence for dynamic disorder in that the 7-mer is floppy, although individual residues fluctuate about preferred low-energy PII and sPII (φ,ψ)-values. Local preferences give the chain a distinctly nonrandom character.
Making contact with experiment
Our analysis of the 7-mer energy landscape supports the view that individual residues show uncorrelated fluctuations about two preferred dipeptide basins, PII (p) and sPII (e). The quantitative analysis of conformational preferences on the energy landscape goes well beyond the capabilities of typical measurements that report on ensemble-averaged properties. The distribution of inherent-structure weights can be used to compute different properties accessible to experimental measurements. To illustrate this, we used our knowledge of the distribution of Boltzmann weights for conformational basins to calculate scalar coupling constants accessible to a typical NMR experiment.
NMR scalar coupling constants characterize interactions between nuclei separated by a small number of covalent bonds (Wüthrich 1986). The Karplus relation (Karplus 1959) relates values for coupling constants between third nearest neighbors ($^3J$, measured in Hertz) to the dihedral angle about a single rotatable bond. Parameters to extract the relationship between the (φ,ψ)-angles and different $^3J$ coupling constants have been worked out in previous studies (Pardi et al. 1984; Wüthrich 1986; Vuister and Bax 1993; Wang and Bax 1995).
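As a concrete illustration of the Karplus relation, the sketch below evaluates the HN-Hα coupling constant from a backbone φ-angle. It is a minimal example under stated assumptions, not the prescription from the Methods section: it assumes the standard three-term form J = A cos²θ + B cos θ + C with θ = φ - 60°, and the coefficient sets are the values commonly quoted for the Pardi et al. (1984) and Vuister and Bax (1993) parameterizations. The function and constant names are hypothetical.

```python
import math

# Karplus coefficients (A, B, C) in Hz for the HN-Halpha coupling, as
# commonly quoted for the cited parameterizations; treat as assumptions.
PARDI_1984 = (6.4, -1.4, 1.9)
VUISTER_BAX_1993 = (6.51, -1.76, 1.60)

def karplus_hn_ha(phi_deg: float, coeffs=PARDI_1984) -> float:
    """Return the HN-Halpha coupling constant (Hz) for a phi-angle in degrees.

    Uses J = A*cos^2(theta) + B*cos(theta) + C with theta = phi - 60 degrees.
    """
    a, b, c = coeffs
    cos_theta = math.cos(math.radians(phi_deg - 60.0))
    return a * cos_theta ** 2 + b * cos_theta + c

# Evaluate the coupling constant at the phi-angles of the p and e minima.
for phi in (-82.0, -147.5):
    print(f"phi = {phi:7.1f} deg -> 3J = {karplus_hn_ha(phi):.2f} Hz")
```

Passing `coeffs=VUISTER_BAX_1993` switches parameterization without changing the functional form, which makes it easy to check how sensitive a result is to the choice of coefficients.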
We used a prescription outlined in the Methods section to calculate $^3J$ coupling constants as weighted thermodynamic averages. Two types of averages were computed. The first, referred to as an annealed average, uses (φ,ψ)-values for each residue of the 7-mer from the sampled distribution of 5 × 10^9 conformers. In addition, we computed quenched averages that use the (φ,ψ)-values of inherent structures to compute the $^3J$ coupling constants. The two sets of results are summarized in Table 5a for T = 100K and Table 5b for T = 300K. The coupling constants $^3J_{HN\alpha}$, $^3J_{HNCO}$, $^3J_{HN\beta}$, and $^3J_{C(i-1)H\alpha(i)}$ report on the ensemble-averaged value of the φ-angle for a particular residue and the measured $^3J_{N(i)H\alpha(i-1)}$ values report on the ensemble average for ψ. In the next few paragraphs, we discuss our calculations in the context of the recent measurements of Shi et al. (2002) for the 7-mer.
The temperature dependence of measured coupling constants and those calculated from our sampled equilibrium ensemble cannot be directly compared because our temperature scale is nonphysical. Our model is incomplete as it ignores the details of chain-solvent interactions. Therefore 100K in our calculations does not correspond to 100K in vitro. However, general trends from theory and experiment can be compared as is shown in Tables 5a and 5b.
The calculated $^3J_{HN\alpha}$ coupling constant is roughly 7.4 Hz for all residues in the 7-mer. As shown in Figure 11, this value is consistent with ensemble average angles of either φ ≈ -85° or φ ≈ -155°. This degeneracy would be referred to as conformational averaging (Smith et al. 1996), and the common assumption would be that all values between φ ≈ -85° and φ ≈ -155° are equally likely. Our analysis of basin occupation probabilities (Fig. 9a,c) allows us to correctly interpret the calculated coupling constants as being consistent with thermal fluctuations of each residue about distinct basins centered about PII and sPII.
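The pairing of -85° with -155° follows from the symmetry of the Karplus relation. As a sketch, assuming the common convention θ = φ - 60° for the HN-Hα coupling and leaving the coefficients A, B, and C symbolic:

```latex
\begin{aligned}
{}^{3}J(\theta) &= A\cos^{2}\theta + B\cos\theta + C, \qquad \theta = \phi - 60^{\circ},\\
{}^{3}J(\theta) = {}^{3}J(-\theta) &\;\Longrightarrow\; \phi' - 60^{\circ} = -(\phi - 60^{\circ}) \;\Longrightarrow\; \phi' = 120^{\circ} - \phi,\\
\phi = -85^{\circ} &\;\Longrightarrow\; \phi' = 205^{\circ} \equiv -155^{\circ} \pmod{360^{\circ}}.
\end{aligned}
```

Because the cosine is even, any φ-angle and its partner 120° - φ give the same coupling constant regardless of the coefficients, so a single value cannot by itself distinguish a PII-like φ from an sPII-like φ.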
The calculated $^3J_{HN\alpha}$ coupling constants shown in Tables 5a and 5b are insensitive to changes in temperature, although the populations for each of the 57 basins change with temperature as shown in Figure 9a,c. Results for the dipeptide (Figs. 6, 7) help clarify the origins of the thermal insensitivity of calculated coupling constants. As temperature increases, increased disorder leads to increased population in basins around the sPII minimum (e-basin) and the αR minimum (r-basin). In both cases, the fluctuations lead to values for the $^3J_{HN\alpha}$ coupling constant that are similar to values computed for fluctuations in the PII basin. Therefore the model is not athermal. Instead, the calculated coupling constants are insensitive to changes in temperature because increased disorder leads to fluctuations about basins that result in similar values for the coupling constant. The values for the remaining coupling constants $^3J_{HNCO}$, $^3J_{HN\beta}$, and $^3J_{C(i-1)H\alpha(i)}$ that report on the φ-angle support the conclusion that the ensemble average is tilted toward PII-like conformers over sPII conformers. Although both basins are prevalent, the PII basin is preferred, and the preference is weak (Table 4).
Shi et al. (2002) used NMR to study the conformers available to an alanine-based peptide, Acetyl-XXAla7OO-Amide (XAO), in water. Here X is diaminobutyric acid and O is ornithine. XAO may be thought of as a soluble version of a 7-residue polyalanine. Well-resolved peaks in the amide region permit the measurement of $^3J_{HN\alpha}$ scalar coupling constants. Shi et al. (2002) calculated the average value of the backbone φ-angle at 2°C to be -70° ± 10° for each of the alanine residues. They used the optimized coefficients of Vuister and Bax (1993) to calculate the φ-angle from the measured coupling constants. The coefficients of Vuister and Bax are only slightly different from those of Pardi et al. (1984) used in our calculations. We have used both sets of coefficients and find that our results are essentially insensitive to the choice of Karplus-equation coefficients. Shi et al. found that the measured $^3J_{HN\alpha}$ coupling constants increase monotonically with temperature, from ≈5.5 Hz at 2°C to ≈6.1 Hz at 56°C. The coupling constants measured by Shi et al. are consistent with φ-angles of ≈ -70° at 2°C and -75° at 55°C. Shi et al. did not measure the other coupling constants. Based on the ratio of NOEs between nearest neighbor β-protons, Shi et al. suggest that the ψ-angles for all alanine residues are likely to be: +145° ± 20°. The average φ-angle for PII consistent with the calculated coupling constant is ≈10° off from the φ-angle calculated using experimentally measured coupling constants. However, we have provided ample evidence that the free-energy surface for the inverse-power potentials is dictated by fluctuations about PII and sPII-helices.
Shi et al. interpret the temperature-dependent behavior of measured coupling constants as evidence for an admixture between β-strand and PII helical structures. It should be noted that their coupling constants are also consistent with φ-angles between -165° and -170°, as shown in Figure 11. These values correspond to the sPII basin (Fig. 6). Shi et al. did not consider this possibility in interpreting their measurements. It would appear that the sPII conformers can be ruled out because the margin of error in the deduced values for the ψ-angles is inconsistent with the possibility of fluctuations around sPII.
There are two possible interpretations for the temperature dependence of measured scalar coupling constants. The sPII basin may be unfavorable in water, and therefore increased thermal agitation leads to increased sampling within the basin around PII, which in our model subsumes (φ,ψ)-values consistent with β-strands that would be part of parallel β-sheets. The interpretation of Shi et al. would be valid if, in our model, fluctuations for all residues in the 7-mer were confined to the blue regions shown in Figure 6a,b. Alternatively, fluctuations at lower temperatures may be centered about PII, whereas at higher temperatures the system fluctuates about both PII and sPII basins. If the latter were true, the measured coupling constants would have to be relatively insensitive to changes in temperature, as was found with calculated coupling constants. The measured coupling constants change by ≈0.6 Hz, and it is not clear whether this is a significant change given that the ensemble-averaged φ-angle changes by 5° over this temperature range. Therefore neither alternative can be unequivocally ruled out, although the monotonic increase of measured $^3J_{HN\alpha}$ values may preclude fluctuations about sPII.
Applying the configurational mapping formalism to a system that includes details of chain-solvent interactions should clarify the interpretations of measured coupling constants. It is also conceivable that softening the potential (Table 3) by choosing smaller values for the exponent of the inverse-power potentials will lead to improved agreement with measured coupling constants. Upon softening the potential, the φ-angle shifts away from the canonical values toward the experimentally measured values (Table 3). The selectivity for extended chain conformers increases with softer potentials and diminishes with hard repulsive potentials.
More importantly, the configurational mapping procedure should allow us to make contact with results from a variety of other measurements including UV circular dichroism (Woody 1992), one-dimensional vibrational absorption spectroscopy (Woutersen and Hamm 2000), two-dimensional infrared polarization spectroscopy (Woutersen and Hamm 2000; Schweitzer-Stenner 2001), and NOE data from two-dimensional NMR experiments (Neuhaus and Williamson 2000). Additional investigations with labeled XAO peptides might permit the measurement of other coupling constants that do not vary steeply for small changes in the φ-angle and that have different values for PII, sPII, and αR φ-angles. Differences in coupling constants such as $^3J_{C(i-1)H\alpha(i)}$ may be useful to determine the energy difference between different conformers. Labeled peptides may also provide information regarding the ψ-angle through the measurement of the $^3J_{N(i)H\alpha(i-1)}$ coupling constant. It is clear that one cannot comment authoritatively on conformational preferences in unfolded states based on the value reported by one number, especially if that number happens to be the scalar coupling constant $^3J_{HN\alpha}$ that gives degenerate values for different φ-angles. Only by making contact with data from diverse experimental probes can we fully validate the quantitative predictions of our simple model.
Connection to recent theoretical work
Banavar et al. (2002) recently showed that considerations of chain connectivity, compactness, and excluded volume impose geometrical constraints on the conformations accessible to coarse-grained protein-like polymers. The global radius of a chain, computed by considering the smallest radii for circles that circumscribe all triplet combinations of Cα atoms, measures the thickness of the chain in a particular conformation. Thickness is related to the "wiggle room" or "free volume" around a chain, and is a direct measure of conformational flexibility. Banavar et al. (2002) found that secondary structure motifs emerge as a consequence of maximizing the thickness upon chain compaction.
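The triplet construction can be sketched as follows. This is one reading of the description above, not the implementation of Banavar et al. (2002) or of this work: it assumes that the thickness of a conformation is the smallest circumradius found over all triplets of points, and the function names and test coordinates are hypothetical.

```python
import itertools
import math

def circumradius(p, q, r) -> float:
    """Radius of the circle through three 3D points (inf if collinear)."""
    a = math.dist(q, r)
    b = math.dist(p, r)
    c = math.dist(p, q)
    s = 0.5 * (a + b + c)
    # Heron's formula; clamp tiny negative values caused by round-off.
    area_sq = max(s * (s - a) * (s - b) * (s - c), 0.0)
    if area_sq == 0.0:
        return math.inf
    return a * b * c / (4.0 * math.sqrt(area_sq))

def chain_thickness(points) -> float:
    """Smallest circumradius over all triplets of points."""
    return min(circumradius(p, q, r)
               for p, q, r in itertools.combinations(points, 3))

# Four points on a unit circle: every triplet lies on the same circle.
square = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
print(chain_thickness(square))
```

The loop visits every triplet, so the cost grows as the cube of the number of atoms, which is negligible for the seven Cα atoms of the 7-mer.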
We calculated the chain thickness for each of the enumerated inherent structures of the 7-mer at T = 300K. The distribution of the chain thickness values (Fig. 12) is bimodal. The linear, uniformly weighted average thickness is 2.73 Å. This value is in close agreement with the chain thickness for a 7-mer that populates mixtures of compact and extended conformations, which would be expected for a statistical random-coil. Conversely, the weighted average, calculated using the Boltzmann weights associated with each inherent structure, is 3.48 Å, a value consistent with preferential population of extended conformers for the 7-mer. The ratio of the weighted average to the linear average, a measure of nonrandomness realized by preferring extended conformers, is ≈0.73. This value is in agreement with that computed by Banavar et al. (2002) for the ratio of the chain thickness of β-strands or extended conformers to compact α-helices. Maximizing chain thickness under good solvent conditions, akin to minimizing chain packing density, promotes unhindered conformational fluctuations about PII and sPII helices. Conversely, one would expect maximization of chain thickness under poor solvent conditions, akin to maximizing chain packing density, to lead to α-helix or β-sheet formation. This expectation is consistent with the findings of Banavar et al. (2002).