|
|
||||||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 Department of Physics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
2 National Research Laboratory for Computational Proteomics and Biophysics, Department of Physics, Pusan National University, Pusan 609-735, Korea
3 Institute of Physics, Polish Academy of Sciences, 02-668 Warsaw, Poland
4 INFM and Dipatrimento di Fisica G. Galilei, Universitá di Padova, 35131 Padova, Italy
5 The Abdus Salam International Center for Theoretical Physics (ICTP), 34100 Trieste, Italy
Reprint requests to: Marek Cieplak, Institute of Physics, Polish Academy of Sciences, 02-668 Warsaw, Poland; e-mail: mc{at}ifpan.edu.pl; fax: 48-22-843-0926.
(RECEIVED March 1, 2004; FINAL REVISION May 7, 2004; ACCEPTED May 14, 2004)
| Abstract |
|---|
|
|
|---|
values determined in protein engineering experiments. We benchmark our findings to various theoretical approaches proposed in the literature for the identification and characterization of the transition state. Keywords: protein folding; transition state; protein engineering
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.04713804.
| Introduction |
|---|
|
|
|---|
, the transition state; and N. The transition state corresponds to the highest free-energy barrier and provides a bottleneck for the conversion to the native state.
The phenomenological two-state picture raises many questions when one considers the molecular structure of a protein. For instance, there is a huge number of conformations that the protein may adoptwhich of these ought to be classified as
, or D? Do the other conformations matter? What is the meaning of the reaction coordinate? Because the transition state must be short-lived and barely populated, how can one find it experimentally or elucidate it theoretically?
One way to deal with the multiplicity of the microscopic conformations is to view the folding phenomenon as being akin to a first-order phase transition (albeit in a finite system) with its kinetic mechanism being similar to nucleation (Abkevich et al. 1994; Fersht 1997). The idea of the transition state morphs then into that of a folding nucleus, which acts as a critically sized droplet of the folded phase. The criticality condition means that the droplet may either shrink (which leads to unfolding) or expand (which leads to folding) with equal probability; that is, the droplet is on the edge between the folded and unfolded basins of attraction. The nucleation interpretation immediately suggests that there could be many different "droplets" that form an ensemble of the transition states (Pande et al. 1998; Pande and Rokhsar 1999a, b). Is this suggestion valid?
The established experimental way to probe the transition state or states is through the techniques of protein engineering (Oxender and Fox 1987; Robson and Garnier 1988; Matouschek et al. 1989, 1990; Jackson and Fersht 1991; Otzen et al. 1994; Itzhaki et al. 1995; Cleland and Craik 1996; Fersht 1998; Carmichael Wallace 1999). The basic idea entails the substitution of amino acids in different positions of a protein with other amino acids and monitoring the resulting changes in the stability of the native state and the kinetics of folding or unfolding. The effects of these substitutions are characterized by means of a set of the folding
values (
f), which are measures of the changes in the kinetic rates normalized by corresponding changes in the protein stability. In simple situations, the
fs take values between 0 and 1. A value that is close to 1 suggests a nearly native-like structure of the site of substitution in the transition state. Therefore, new questions emergefor instance, how may one identify conformations that are compatible with the measured
values? Furthermore, how may one interpret nonclassical values that are negative or >1?
The list of such basic and unsolved questions is long and so is the list of different answers that have been offered in the literature. This situation calls for considering a simple model that displays two-state physics and is amenable to exact solution, through which one may resolve the key issues and elucidate the underlying concepts. In this paper, we analyze a model that encapsulates many of the essential features of protein folding kinetics with a nontrivial free-energy landscape. This model is a variant of a system considered by Munoz, Eaton, and their collaborators (Munoz et al. 1997, 1998; Munoz and Eaton 1999). It is Go-like (Abe and Go 1981), and it embodies the topology of the
-hairpin. It was introduced in the context of experimental studies of a corresponding fragment in the protein G (Munoz et al. 1997). Munoz et al. (1997) have considered a peptide of 16 residues with one tryptophan (W43) to investigate the kinetics of
-hairpin formation in a laser-induced temperature jump experiment. Measurements of the tryptophan fluorescence have indicated that the relaxation to equilibrium is governed by a single exponential and corresponds to a single free-energy barrier. The time constant is ~6 µsec, which is about 30 times longer than that found in comparable
-helices. Its equilibrium properties have been further explored theoretically by Flammini et al. (2002) and Bruscolini and Pelizzola (2002). We consider a shorter, 12-amino acid version of the original model; reformulate it in terms of Ising spins, which can take on one of two values, corresponding to the immediate vicinity of the protein being native-like or not; and endow it with single spin flip kineticsMunoz and Eaton had considered diffusional kinetics instead. The kinetics is formulated in terms of a Master Equation that deals with probabilities and not specific trajectories. We then go on to use the results of our exact solution of the model to understand the nature of the transition state and the significance of
values.
| Results |
|---|
|
|
|---|
angles taking on their native-state values. This binary character of the bond placement allows for an Ising-like modeling, and we adopt spin variables Sn, which take values 1 or 0 correspondingly. The free energies per mole can be written as:
|
![]() | (1) |
A nonzero value of the product SlSl + 1 ... Sm implies that all peptide bonds between l and m are set in the native fashion, which allows for the establishment of native interactions in the cluster between the bonds l and m. These interactions are either hydrophobic or due to establishment of the hydrogen-bonds, or both. For simplicity, we assume that the strength of the interactions, J, is the same in both cases and equal to 1000 K, whereas the conformational entropy per spin,
Sconf, is taken to be 2.14R, where R is the gas constant. In equation 1, T denotes the temperature. Therefore, the stability of the
-hairpin system is controlled by a competition between the gain in the energy of the established contacts and the loss of conformational entropy on setting the conformational angles to their native values.
We choose
lm to be 2 for (l, m) = (1, 11) and (3, 9), 1 for (l, m) = (2, 10), (4, 8), (5, 7), and (1, 9), and 0 otherwise. Note that the placement of the contacts breaks the symmetry between the upper and lower branches of the hairpin. There are several reasons that we consider the simpler 12-residue system. First, the number of conformations is significantly smaller, which facilitates the computational study. Second, the model is simplified by choosing just one interaction parameter. Third, we remove an unnecessary complication of the original model that is related to the fact that two residues at each of the terminal ends of the hairpin are not stabilized by the interactions present in the 16-residue system. In fact, the hairpin conformation is not a free-energy minimum for either the original or the simplified couplings. Let the free energy in conformation i be denoted by Gi. The equilibrium probability to occupy this conformation, Pi, is then given by
![]() | (2) |
Kinetics: relaxation, folding, and unfolding
The relaxational spin-flip kinetics can be described in terms of the Master Equation (e.g., see Cieplak et al. 1998; Ozkan et al. 2002) for the time-dependent vector of probabilities P with components Pi. By convention, we take i = 1 to correspond to the native state. The Master Equation reads
![]() | (3) |
with Mij for the flip from state j to i equal to
if Gi < Gj and
otherwise.
is the attempt rate, which may generally depend on T. The diagonal elements are set so that the sum of the terms in each column is zero. This choice of the matrix is consistent with the detailed balance condition and the Arrhenius form of the low-T relaxation processes. The time evolution of P can be obtained through an iterative use of the equation P(t +
t) = (1
tM)P(t), where
t denotes an infinitesimal time increment. An alternative way to follow the kinetics is by decomposing P into the right-handed eigenvectors and by endowing them with an exponential time dependence of the form exp(
t), where 
is the eigenvalues of the M matrix. One eigenvalue is always zeroit corresponds to the system staying in equilibrium. The smallest nonzero eigenvalue, denoted as k, is the slowest relaxation rate. The inverse of k yields the longest relaxation time. Other eigenvalues correspond to faster processes. The two-state behavior is obtained when there is a substantial separation between the slowest and other rates. Such is, indeed, the case here because our choice of the parameters yields the second longest relaxation time at 300 K to be of order 6% of the longest one. Folding conditions are generated when one disallows all transitions that lead out of the native statethe first column of the M matrix is set equal to zero, and the native state acts as the probability sink. The resulting matrix will be denoted by Mf. On the other hand, the unfolding conditions are generated by making the completely unfolded state a probability sink, and the corresponding column is set equal to zero to obtain the matrix Mu. In these cases, the smallest nonzero eigenvalue corresponds to the slowest folding and unfolding rates, denoted as kf and ku, respectively.
The free-energy levels in our model depend on temperature. However, to identify kinetic barriers, it is useful to freeze the levels at their 300 K values and to introduce another fictitious temperature, T', that can be varied at will. In particular, it can be set to zero to identify characteristic times that diverge in this limit. We find that the eigenvalues (the inverse relaxation times) either tend to zero as T' approaches zero (there are four such eigenvalues with eigenvectors corresponding to the four local minima) or they tend to integer values of 1, 2, or 3. The other eigenvalues correspond to downhill motion in the free-energy landscape and are determined by the local topology. As T' increases, there are Arrhenius-like corrections to the T' = 0 limit and considerable mixing of the levels.
Before we continue with the discussion of our model, we note that a strictly two-level system would be described by the following 2 x 2 matrices:
![]() | (4) |
The transition state is implicit kf and ku satisfy
![]() | (5) |
where
Gf
= G
GD and
Gu
= G
GN. Each of the M matrices has one zero eigenvalue, and the other eigen-values are k = kf + ku, kf, and ku for the relaxation, folding, and unfolding situations, respectively, which agrees with the standard expectation (Fersht 1998). The eigenvector corresponding to the nonzero eigenvalue is, in each case, equal to
. Thus, the observation of Ozkan et al.(2002) that the populations corresponding to the relaxation eigenvector of the lowest nonzero eigenvalue are "rigorously what should be called transition state conformations" is not valid. In the two-level system, the eigenvector contains both the D and the N conformations, albeit with opposite sign. One can show exactly that, quite generally, the sum of the components of the eigenvector is 0 and corresponds to a draining out of the probability of occupancies of conformations with a given sign in the eigenvector accompanied by an associated increase in the probabilities of the remaining conformations.
In the 12-amino acid model, there are 2048 possible conformations. The native state corresponds to all spins being equal to 1 and to the establishment of all eight contacts. The fully unfolded state corresponds to all spins being 0 and no contacts. Of these 2048 conformations, 67 have the property that nonzero values of the spins are contiguous, that is, form a single sequence of ones. Indeed, the 67th conformation is the unfolded state in which all the spins have zero values. In the so-called single sequence approximation (Munoz et al. 1998), one restricts the conformation space to just these 67 states. Figure 2
shows that the single sequence approximation gives a fairly accurate picture of the thermodynamic quantities. For the parameters chosen, the folding temperature, Tf, is ~300 K, both exactly and in the single sequence approximation, and this is the temperature at which we focus our further studies. At Tf, the probability to occupy the native state is ~1/2, and the specific heat and fluctuations in the fraction of the established native contacts, Q, show a maximum.
|
|
1 and P67
0), whereas all bottom states produce a flow to state 67. There are six states "on the edge" (5, 16, 25, 33, 40, and 39, shown connected by a line in the figure), which show nearly equal propensities for both directions, and they separate the two regions of behavior. Two of the edge states, 25 and 33, are "confused" the most, and they also have the lowest free energy among the six. Strikingly, none of the edge states is a maximum.
The bottom panel of Figure 3
illustrates the best paths that allow for the optimal pathway between states 1 and 67, in either direction. They correspond to the states marked by stars within the circles. There are 48 such paths because on several horizontal lines of the figure, there are several states from which to choose. These choices are equienergetical, making the optimal path energetically unique. The corresponding free energies and the numbers of the native contacts formed are shown in the right of the bottom panel. The native state corresponds to the free energy of 938 K. The highest barrier to climb on the best trajectories is 1852 K, and it corresponds to populating two degenerate states: 25 and 33, which are the saddle point states. There are two native contacts that are established in these states: between bonds (or spins) 57 and 48, that is, S4 = S5 = S6 = S7 = S8 = 1. In state 25, S3 is nonzero, whereas in state 33 it is S9, which is nonzero. Thus, these two of the edge states are the "transition states" and are shown in the figure as the black circles. The identification of states 25 and 33 as transition states comes also as a result of studies of sensitivities of kf and ku to changes in the free-energy values of individual conformations and noting that the influence of such perturbations is largest in the transition states. This large sensitivity is due to the fact that the transition states act as bottlenecks for the folding and unfolding kinetics. When one considers the full 2048-level description, there are 11! = 39,916,800 different directed paths from (00000000000) to (11111111111). Among these, there are 432 optimal trajectories that include the 48 identified in the smaller subset of statesthe transition states are the same. In the full set, there are four other conformations having the same energy as the transition states, for example, (10011111000), but the optimal directed paths do not encounter these conformations.
The reaction coordinate for the folding transition consists of a list of conformations that are traveled on a directed optimal trajectory. The free energy plotted against this reaction coordinate is shown in Figure 4
(bottom panel). It indicates states 25 and 33 as the transition states. This plot is quite distinct from the free energy, G(Q), calculated as a function of the fraction of the native contacts. As seen in the top panel of Figure 4
, G(Q) has a maximum for Q = 1/4, which corresponds to seven states, but only two of them, 25 and 33, are actually the transition states as obtained through the studies of the kinetic connectivities. We note that the choice of Q as a reaction coordinate has been made, for instance, by Munoz et al. (1998), Clementi et al. (2000), and Shea et al. (2000). They considered Go and non-Go off-lattice models with strong dihedral angle terms in the potential energy. These models exhibit a double minima structure in the free energy when plotted against energy or the fraction of the native contacts that are established during the time evolution of molecular dynamics simulations. They assumed that states contributing to the maximum separating the two minima (i.e., those that are "half-way" in terms of the number of contacts) form the transition state ensemble.
|
|
Time evolution of the probabilities
Our framework provides a straightforward mechanism for monitoring the temporal evolution of the probabilities of the protein to be in a given conformation and average values of physical quantities in terms of linear combinations of the eigenvectors of the M matrix. The smallest nonzero eigenvalues describe the long time behavior. The combined effect of all eigenvectors, at any time, can be assessed from the full time evolution of P. Figure 6
shows the evolution of P1 and P67 in the 67-level system under the conditions of folding, unfolding, and relaxation. The plots for folding and unfolding are not symmetric: the occupation of the unfolded state disappears much more rapidly on folding than of the native state on unfolding. This is because there are many more ways to exit from state 67 compared with just two ways to exit from state 1, leading to much smaller contributions from state 1 to the eigenvectors corresponding to large eigenvalues (or short times).
|
|
|
|


![]() | (6) |
and similarly
where K is the equilibrium constant, which in the two-state model is given by:
![]() | (7) |
The two-state picture holds if m = mf + mu. At Tf, K should be 1, and this is nearly the case if we count not only state 67 but also all of the last but one row states of the triangle of Figure 3
as belonging to the coarse-grained denatured state. Another quantity of interest is the parameter 
, defined as:
![]() | (8) |
Let us postulate a linear effect of the denaturants concentration on the free energies so that GN(x) = GN + yNx and G
(x) = G
+ y
x, where Gi (i = N,
, D) on the right-hand sides of the equations denotes the x = 0 values of the free energy; the denatured state is expected to be unaffected by x. In this case, 
= y
/yN, that is, 
does not depend on x. This expression for 
indicates that this quantity measures the amount of the native-state-like structure contained in the transition state, which, in turn, suggests the common interpretation that it is related to the amount of the buried surface area. There are proteins, however, in which 
shows a linear variation with x. A varying 
would then mean that either the free energy of the transition state varies with x in a way unrelated to the free-energy changes in the native state (e.g., because of the presence of nonnative contacts in the transition state) or that the identity of the transition state varies with x. In the latter case, the adjustments of the free-energy landscape can be captured by a "movie" (Oliveberg et al. 1995; Ternstroem et al. 1999; Oliveberg 2001). In our model, the transition state remains the same, that is, it does not "move" when x changes between 0.25 and 0.25, as shown in Figure 4
, and yet 
varies. The bottom panel of Figure 10
shows that the dependence is nearly linear. The slopes in the 67- and 2048-level systems are almost the same. It is only in the limit of four states that 
is constant, and equal to 1/4. If all states are included, the Chevron branches acquire curvature (see Fig. 8
), and 
is merely a measure of the curvature generated by the presence of states that are not present in the two-state picture.
|
values
values in our model, at x = 0, we consider a small local adjustment in J at the location of a given amino acid. The adjustment is taken to be of order 5%. The
values are practically independent of the magnitude of adjustment between 1% and 5%. There are 12 possible locations that are either at a joint between two bonds (two spins) or at the end points of the system. Note that various amino acid locations correspond to different numbers of bonds that are affected. For instance, the ninth amino acid belongs to bonds S8 and S9 (see Fig. 1
kf and
ku, respectively, which allows one to calculate
![]() | (9) |
and
![]() | (10) |
for a mutation at any of the 12 amino acid sites. Note that the folding and unfolding
values satisfy the condition
f +
u = 1.
The two-state picture interprets the
values in terms of changes in the Gibbs free energy of the folded state and the transition state brought about by the mutation. Specifically, using equation 5,
![]() | (11) |
and
![]() | (12) |
where the symbol
indicates a change in, say,
G = GN GD relative to the respective wild-type value. The two-state picture is obtained when one restricts the conformation space to just four levels: 1, 25, 33, and 67. The set of the corresponding
f values is shown in Figure 10
as asterisks and marked as 4-state. They are equal to 1 at sites 5, 6, 7, and 8 (between bonds S4 and S8); equal to 1/3 at site 4; to 1/4 at sites 9; and to 0 at the remaining end sites. This pattern is consistent with the structure of states 25 and 33. When we consider 67 levels, the sites near the turn still have high
f values, but they become reduced to ~0.8. At the same time, the values near the end points are enhanced, and only the very end points continue to have strictly vanishing
values. The pattern of the
f values gets a small shift when the full set of 2048 states is considered. It should be noted that the
values depend on T and on other modifications in the free-energy landscape such as a lowering of one of the free-energy minima.
| Discussion |
|---|
|
|
|---|

does not indicate a moving transition state. The free-energy landscape of our model is not endowed with a funnel (Onuchic et al. 1995), and yet it provides for fast folding. Whether the landscape incorporates a funnel or not, one would expect that the transition states are akin to saddle points with very low occupational probabilities. Such states ought to be hard to spot through simulations.
It should be pointed out that studies of the so-called disconnectivity graphs for the polyalanines (Becker and Karplus 1997; Dobson et al. 1998; Levy et al. 2001) also do not yield a funnel-like landscape and suggest instead that the conformational space should be visualized as a broad basin with several pronounced minima at its bottom. The disconnectivity graphs constructed by Wales et al. (2000) for various protein-like systems are endowed with many "transition states." These, however, are defined as saddle point conformations separating two arbitrary local energy minima. One of these saddle points should correspond to the transition state of Eyring, but all others are not expected to be relevant kinetically.
The issue of multiplicity of folding nuclei
A multiplicity of distinct transition states or critical droplets is also implied by the nucleation-condensation picture of folding (Abkevich et al. 1994; Fersht 1997) and the neoclassical approach of Pande, Rokhsar, and their collaborators (Pande et al. 1998; Pande and Rokhsar 1999a, b). In practice, the droplets were identified as the edge conformations such that time evolution leads to folding and unfolding with equal probabilities. In lattice models, these probabilities are calculated by determining the fate of enumerated short Monte Carlo trajectories that originate from the conformations. Our calculations show that only the lowest free-energy edge states are transition states. Pande and Rokhsar have also studied off-lattice models through all-atom molecular dynamics simulations in unfolding trajectories. In particular, they considered the
-hairpin system of protein G (Pande and Rokhsar 1999b; related studies were done in Dokholyan et al. 2000 and Ding et al. 2002) and identified four characteristic stagesor clusters of conformationsdenoted consecutively as F, H, S, and U. They identified conformations (regions of values of the radius of gyrations and of Q), which correspond to the edge states separating F and H and similarly the edge states separating S and U. Both are treated as independent transition states without a comparison of their free energies and without a determination of the edge region between H and S. The edge region between H and S may actually correspond to the highest energy, and if so it would correspond to the true transition state provided the paths that go through the stages F-H-S-U are close to being optimal. The procedure of determining "transition states" for pairs of certain stages may not be always correct, because the problem of the optimal path is global in nature and partitioning it into subtasks may work only as an approximation. We should also point out that their procedure identifies the hydrophobic cluster (in our model, spins 1, 2, 3, 9, 10, and 11) as folding first and the turn region as folding last. This does not agree with either the original interpretation of the experiment (Munoz et al. 1997, 1998) or with the structure of the transition state found in our model. It is interesting to note, however, that an all-atom multicanonical Monte Carlo simulation with implicit solvation effects performed by Dinner et al. (1999) suggests that the folding does, indeed, start at the hydrophobic cluster. Furthermore, the folding rate is found to be dominated by the time scale of interconversion between compact conformations. Although the experiment (Munoz et al. 1997, 1998) dos not exclude this folding scenario, the additional experiments and simulations may yield a more complete understanding of the folding kinetics in the
-hairpin. Our model is not meant to generate a realistic picture of the hairpin but is meant to merely provide an illustration of the concepts.
Transition state through abrupt changes in the structure
The picture of multiple folding nuclei has been also advocated by Klimov and Thirumalai (2001). They also argued that these nuclei should contain nonnative contacts. Our analysis does not allow us to draw any conclusions about the role of the nonnative contacts because they are not addressable in the present model. Their method of identification of the folding nucleus is based on sudden changes in structure in the very last stages of folding, that is, when the time evolution ought to be entirely governed by the eigenvector corresponding to the smallest folding eigenvalue, which has very little weight in the transition state. Note that there are no sudden changes in properly averaged time-dependent observables, as evidenced in our model by Figures 6
and 7
. In particular, the probabilities to establish contacts are given by curves that are smooth and monotonic. Thus, any abrupt features should be either due to the presence of intermediates (i.e., be outside of the two-state picture) or be due to insufficient averaging. If one trajectory shows an abrupt structural change at one point, there must be other trajectories that would have abruptness at other points so that a many trajectory average is smooth.
A similar criticism applies to the molecular-dynamics-based identification of the transition state (Li and Daggett 1994; Kazmirski et al. 2001). The operational definition of the transition states is given "as the ensemble of structures populated immediately prior to the onset of a large structural change" during unfolding. Note that all sufficiently averaged quantities should be smooth functions of time, as discussed above. Thus, any method based merely on abrupt changes in the structure probably cannot identify the transition state. Furthermore, it should be noted that unfolding simulations typically impose unfolding conditions through an application of a high temperature (above 200°C) and sometimes high pressure. Both of these circumstances are expected to alter the free-energy landscape significantlypossibly beyond any meaningful comparison with the experiment.
The reaction coordinate and eigenvectors
We have already mentioned the attempts to link the transition state to the eigenvalue analysis (Ozkan et al. 2001, 2002) of the Master Equation. They argue that the eigenvector corresponding to the lowest eigenvalue of the relaxational M matrix can be interpreted as providing a reaction coordinate and a selection of the transition state. We find in our model that the relaxational eigenvector is a linear combination of essentially all 67 conformations, and the true transition state is the 12th weakest weight state.
Selection of the transition state based on the
values
An entirely different way to determine the transition state is generated by a computational exploration of the conformations of a protein followed by an attempt to match them with experimentally determined
f values. If the models are off-lattice, then the procedure involves some clustering of conformations. Examples of this approach are in papers by Vendruscolo et al. (2001) and Paci et al. (2003). The assumed connection of a conformation with the
f values is through the degree of nativeness,
i, of the local structure. This degree is defined by the number of established native contacts that are linked to the mutated amino acid divided by the maximal native number. The calculated values of
i are then compared with the experimental values
i, which are defined as
f at site i. The transition-state conformations are assumed to be those that minimize the distance between
i and
i. Paci et al. (2003) have found a dynamical way of generating the best conformations of this sort by running a simulation that punishes the departures from the experimental values of
i.
It is easy to test this approach in the 67-level model. We determine the
i values and compare them with the
i obtained through the Master Equation approach. We find that there are seven conformations that have the smallest and identical Euclidean distance of 0.636 from the kinetically derived values. In addition to the two transitions states 25 and 35, these are states 4, 15, 31, 32, and 34 defined as (11111111000), (01111111000), (00011111111), (00011111110), and (00011111000). These seven conformations form a V-shape in the diagram of the states shown in Figure 3
. All of them have the
values given by (0 0 0
1 1 1 1
0 0 0). Our conclusion is that even though the
-value-based method does succeed in finding the transition states, it also finds many other spurious conformations. We conclude that this approach is not fool-proof if it is not followed by some additional selection of the states. The need for additional criteria for the selection of transition-state conformations was highlighted by Vendruscolo et al. (2001), who used the
Tanford analysis for this purpose.
Nonclassical
values
We now consider the issue of the nonclassical, that is, negative or >1, values of
. Ozkan et al. (2001, 2002) argue that the folding pathways have a different character away from the native state, where there is a multiplicity of parallel "routes downhill," and near the native state, where folding is slow and serial-like. They postulate that the transition state is located near the place where there is a change in the network topology, and it acts as a switch for the flows of probability. They consider a specific model that is assumed to have two main channels for the flow and suggest that mutations may destabilize one, say slow, channel and direct more flow to another channel. This picture allows them to argue that the
values are measures of the acceleration/deceleration of folding resulting from the mutations. Their model yields nonclassical values of
.
Consider a folding rate that is a sum of two independent, parallel processes (i.e., of the probability flow through two channels): kf = kf1 + kf2 and similarly ku = ku1 + ku2. We assume that the single channel folding and unfolding rates are described as in equation 5 but with the individual barrier heights
Gfi
and
Gu
(
= 1, 2). Suppose that a mutation shifts the native-state free energy by g so that
![]() | (13) |
where the superscript 0 indicates the wild-type value. We assume that the mutation does not affect the free energy of the denatured state, GD = GD0, whereas the individual transition-state free energies get shifted in proportion to g. Thus
![]() | (14) |
where µ
are the coefficients of proportionality. Note that equation 9 can be rewritten as
In the two-channel case,
![]() | (15) |
Note that the coefficients µ
are expected to be <1 and positive, which means that
is negative and thus
f can-not exceed 1. A possibility that nonclassical values of
would arise is if the coefficients have opposite signs. This could arise naturally when the transition state has nonnative contacts, as noted by Li et al. (2000). In the case of the three-state barnase, the nonnative contacts have been revealed through protein substitution studies (Matouschek et al. 1992; Tissot et al. 1996; Dalby et al. 1998) as arising in a long-lasting intermediate state.
Conclusions
Our benchmarking of various methods to determine the transition state in the exactly solvable model indicates that the most practical method entails using the experimental values of
combined with kinetic simulations to determine the set of conformations that are both the most compatible with the
values and are edge states. A further refinement would entail picking conformations with the lowest free energy from this predetermined set. Such a refinement is probably less necessary for a large protein with multiple constraints imposed by the
values. It is possible that for sufficiently large proteins, compatibility with the kinetic simulations may already select the correct state.
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
|
|
|---|
Abkevich, V.I., Gutin, A.M., and Shakhnovich, E.I. 1994. Specific nucleus as the transition state for protein folding: Evidence from the lattice model. Biochemistry 33: 1002610036.[CrossRef][Medline]
Anfinsen, C. 1973. Principles that govern the folding of protein chains. Science 181: 223230.
Baldwin, R.L. and Rose, G.D. 1999a. Is protein folding hierarchic? I. Local structure and peptide folding. Trends Biochem. Sci. 24: 2633.[CrossRef][Medline]
. 1999b. Is protein folding hierarchic? II. Folding intermediates and transition states. Trends Biochem. Sci. 24: 7783.[CrossRef][Medline]
Becker, O.M. and Karplus, M. 1997. The topology of multidimensional potential energy surfaces: Theory and application to peptide structure and kinetics. J. Chem. Phys. 106: 14951517.[CrossRef]
Bruscolini, P. and Pelizzola, A. 2002. Exact solution of the Munoz-Eaton model for protein folding. Phys. Rev. Lett. 88: 258101.[CrossRef][Medline]
Carmichael Wallace, J.A., ed. 1999. Protein engineering by semisynthesis. CRC Press, New York.
Chan, H.S., and Dill, K.A. 1998. Protein folding in the landscape perspective: Chevron plots and non-Arrhenius kinetics. Proteins 30: 233.[CrossRef][Medline]
Cieplak, M., Henkel, M., Karbowski, J., and Banavar, J.R. 1998. Master equation approach to protein folding and kinetic traps. Phys. Rev. Lett. 80: 36543657.[CrossRef]
Cleland, J.L. and Craik, C.S., eds. 1996. Protein engineering: Principles and practice. Wiley-Liss, New York.
Clementi, C., Nymeyer, H., and Onuchic, J.N. 2000. Topological and energetic factors: What determines the structural details of the transition state ensemble and "enroute" intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298: 937953.[CrossRef][Medline]
Dalby, P.A., Oliveberg, M., and Fersht, A.R. 1998. Folding intermediates of wild-type and mutants of barnase. I. Use of
-value analysis and m values to probe the cooperative nature of the folding pre-equilibrium. J. Mol. Biol. 276: 625646.[CrossRef][Medline]
Ding, F., Dokholyan, N.V., Buldyrev, S.V., Stanley, H.E., and Shakhnovich, E.I. 2002. Direct molecular dynamics observation of protein folding transition state ensemble. Biophys. J. 83: 35253532.
Dinner, A., Lazardis, T., and Karplus, M. 1999. Understanding
-hairpin formation. Proc. Natl. Acad. Sci. 96: 90689073.
Dobson, C.M., Sali, A., and Karplus, M. 1998. Protein folding: A perspective from theory and experiment. Angew. Chem. Int. Ed. 37: 868893.[CrossRef]
Dokholyan, N.V., Buldyrev, S.V., Stanley, H.E., and Shakhnovich, E.I. 2000. Identifying the protein folding nucleus using molecular dynamics simulations. J. Mol. Biol. 296: 11831188.[CrossRef][Medline]
Eyring, H. and Stern, A.E. 1939. The application of the theory of absolute reaction rates to proteins. Chem. Rev. 24: 253270.[CrossRef]
Fersht, A.R. 1997. Nucleation mechanisms in protein folding. Curr. Opin. Struct. Biol. 7: 39.[CrossRef][Medline]
. 1998. Structure and mechanism in protein science: A guide to enzyme catalysis and protein folding. Freeman, New York.
Flammini, A., Banavar, J.R., and Maritan, A. 2002. Energy landscape and native-state structure of proteinsA simplified model. Europhys. Lett. 58: 623629.[CrossRef]
Itzhaki. L.S., Otzen, D.E., and Fersht, A.R. 1995. The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: Evidence for a nucleation-condensation mechanism for protein folding. J. Mol. Biol. 254: 260288.[CrossRef][Medline]
Jackson, S.E. and Fersht, A.R. 1991. Folding of chymoptrypsin inhibitor 2. 1. Evidence for a two-state transition. Biochemistry 30: 1042810435.[CrossRef][Medline]
Kazmirski, S.L., Wong, K.-B., Freund, S.M.V., Tan, Y.-J., Fersht, A.R., and Daggett, V. 2001. Protein folding from a highly disordered denatured state: The folding pathway of chymotrypsin inhibitor 2 at atomic resolution. Proc. Natl. Acad. Sci. 98: 43494354.
Klimov, D.K. and Thirumalai, D. 2001. Multiple protein folding nuclei and the transition state ensemble in two-state proteins. Proteins 43: 465475.[CrossRef][Medline]
Levy, Y., Jortner, J., and Becker, O.M. 2001. Solvent effects on the energy landscapes and folding kinetics of polyalanine. Proc. Natl. Acad. Sci. 98: 21882193.
Li, A. and Daggett, V. 1994. Characterization of the transition state of protein unfolding by use of molecular dynamics: Chymotrypsin inhibitor 2. Proc. Natl. Acad. Sci. 91: 1043010434.
Li, L., Mirny, A.L., and Shakhnovich, E.I. 2000. Kinetics, thermodynamics and evolution of non-native interactions in a protein folding nucleus. Nat. Struct. Biol. 7: 336342.[CrossRef][Medline]
Matouschek, A., Kellis Jr., J.T., Serrano, L., and Fersht, A.R. 1989. Mapping the transition state and pathway of protein folding by protein engineering. Nature 342: 122126.
Matouschek, A., Kellis Jr., J.T., Serrano, L., Bycroft, M., and Fersht, A.R. 1990. Transient folding intermediates characterized by protein engineering. Nature 346: 440445.[CrossRef][Medline]
Matouschek, A., Serrano, L., and Fersht, A.R. 1992. The folding of an enzyme. IV. Structure of an intermediate in the refolding of barnase analyzed by a protein engineering procedure. J. Mol. Biol. 224: 819835.[CrossRef][Medline]
Munoz, V. and Eaton, W.A. 1999. A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc. Natl. Acad. Sci. 96: 1131111316.
Munoz, V., Thompson, P.A., Hofrichter, J., and Eaton, W.A. 1997. Folding dynamics and mechanism of
-hairpin formation. Nature 390: 196199.[CrossRef][Medline]
Munoz, V., Henry, E.R., Hofrichter, J., and Eaton, W.A. 1998. A statistical mechanical model for
-hairpin kinetics. Proc. Natl. Acad. Sci. 95: 58725879.
Oliveberg, M. 2001. Characterisation of the transition states for protein folding: Towards a new level of mechanistic detail in protein engineering analysis. Curr. Op. Str. Biol. 11: 94100.[CrossRef][Medline]
Oliveberg, M., Tan, Y.-J., and Fersht, A.R. 1995. Negative activation enthalpies in the kinetics of protein folding. Proc. Natl. Acad. Sci. 92: 89268929.
Onuchic, J.N., Wolynes, P.G., Luthey-Schulten, Z., and Socci, N.D. 1995. Toward an outline of the topography of a realisti