Protein Science Sheba protein
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS
 QUICK SEARCH:   [advanced]


     


Protein Science (2004), 13:358-369. Published by Cold Spring Harbor Laboratory Press. Copyright © 2004 The Protein Society
This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Right arrow Citation Map
Services
Right arrow Email this article to a friend
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Leonhard, K.
Right arrow Articles by Radke, C. J.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Leonhard, K.
Right arrow Articles by Radke, C. J.
Social Bookmarking
 Add to CiteULike   Add to Connotea   Add to Del.icio.us   Add to Digg   Add to Reddit   Add to Technorati  
What's this?

Solvent–amino acid interaction energies in three-dimensional-lattice Monte Carlo simulations of a model 27-mer protein: Folding thermodynamics and kinetics

Kai Leonhard1, John M. Prausnitz1,2 and Clayton J. Radke1,3

1 Department of Chemical Engineering, University of California, Berkeley, California 94720-1462, USA
2 Chemical Sciences Division and
3 Earth Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA

Reprint requests to: Clayton J. Radke, Department of Chemical Engineering, University of California, Berkeley, CA 94720-1462, USA; e-mail: radke{at}cchem.berkeley.edu; fax: (510) 642-4778.


    Abstract
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
Amino acid residue–solvent interactions are required for lattice Monte Carlo simulations of model proteins in water. In this study, we propose an interaction-energy scale that is based on the interaction scale by Miyazawa and Jernigan. It permits systematic variation of the amino acid–solvent interactions by introducing a contrast parameter for the hydrophobicity, Cs, and a mean attraction parameter for the amino acids, {omega}. Changes in the interaction energies strongly affect many protein properties. We present an optimized energy parameter set for best representing realistic behavior typical for many proteins (fast folding and high cooperativity for single chains). Our optimal parameters feature a much weaker hydrophobicity contrast and mean attraction than does the original interaction scale. The proposed interaction scale is designed for calculating the behavior of proteins in bulk and at interfaces as a function of solvent characteristics, as well as protein size and sequence.

Keywords: Lattice simulation; interaction energies; protein folding; aggregation; solvation

Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.03198204.


    Introduction
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
In an attempt to understand folding and aggregation of proteins in solution, a large number of studies have been reported (Laurents and Baldwin 1998; Gruebele 1999). This work contributes toward that understanding. By using 27-mer chains of amino acids in a solvent (typically, water), three-dimensional (3D) Monte Carlo (MC) simulations are performed using different values of the characteristic solvent–amino acid energy.

The biological function of a protein is coupled strongly to its conformation or tertiary structure. For native proteins, this conformation is believed to correspond to the global minimum of free energy (Asherie et al. 1998; Fink 1998; Harrison et al. 1999). The order of free-energy levels of different conformations can change in the presence of other molecules (macromolecules such as chaperones or amyloids, or small molecules such as urea or sugars) and in the presence of interfaces (Srebnik et al. 1998; Anderson et al. 2000). A change in conformation usually restrains the biological function of a protein and may lead to precipitation (Asherie et al. 1998). Conformation change is relevant for the production (Georgiou and Bowden 1990), formulation, and application (Costantino et al. 1995) of protein-containing drugs and nutrition supplements. Aggregation also plays a crucial role in several diseases, including Alzheimer’s (Mitraki and King 1989; Wetzel 1994, 1996; Janicke 1995; Silow and Oliveberg 1997; Fink 1998) and bovine spongiform encephalopathy (BSE)/scrapie (Prusiner 1991; Cohen et al. 1994; Prusiner and DeArmond 1994). Despite the importance of protein aggregation, much remains to be learned.

In the vast majority of previous folding and aggregation studies, one of three interaction-energy models is used: First, in the Go model (1983), two amino acid beads interact with energy -{varepsilon} if they are in contact in the native state; otherwise, the interaction energy is zero. Many studies (Du et al. 1999; Pande and Rokhsar 1999; Bratko and Blanch 2001, 2003; Cheung et al. 2002), among others, adopt the Go model to study protein folding and aggregation in bulk water. In the Go model, each bead is unique and attracts only those beads that are neighbors in the native state. For a lattice protein of N beads, the Go model is an N-letter model with a very high specificity. In a Go-model protein, more different amino acid beads can appear than in nature. Second, the HP model contains two different kinds of amino acid beads: hydrophobic and polar (Lau and Dill 1989). In the original HP model, only the H–H attraction is different from zero. In later studies, nonzero H–P and P–P interactions have also been investigated (Li et al. 1996; Gupta et al. 1998; Hirst 1999; Giugliarelli et al. 2000; Crippen and Chhajer 2002). The HP model belongs to the class of two-letter models because it uses two different kinds of beads. In the HP model, no different strengths of hydrophobicity and no specific interactions between certain pairs of amino acids exist. Third, the more complex Miyazawa-Jernigan model (MJ; Miyazawa and Jernigan 1985) is a 20-letter model used, for example, by Tiana et al. (1998), Broglia et al. (1998), Anderson et al. (2000), and Mirny et al. (1998). It has the advantage of providing experimentally determined interaction energies for all 20 amino acids, allowing for different degrees of hydrophobicity and polarity and specific amino acid interactions. For example, the self-interaction of charged amino acids is very weak, but oppositely charged amino acids attract each other quite strongly. Our interaction-energy model is an extension of the MJ scheme. Miyazawa and Jernigan later (1996) refined their interaction-energy set based on a larger amalgam of experimental data. Interaction energies for most amino acids changed only slightly. We adopt the original interaction set in this work because the new MJ potentials contain a nonpairwise-additive packing term that makes simulation computationally demanding and that makes isolation of energy-scale effects more difficult.


    The MJ interaction model
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
In the MJ picture, the quasichemical approximation underlies the contact energies between amino acid residues obtained from the statistical number of contacts between amino acids observed in the crystal structures of real proteins. To include solvent effects into effective residue interactions, the accessible area of amino acids is considered to be covered by solvent molecules. Therefore, in a strict sense, the MJ "energies" are not actual energies but free energies or temperature-dependent mean field energies. In this work, as in other simulation studies, MJ free energies are used as interaction energies. For the 20 amino acids and the solvent, the interaction energies can be written in a 21 x 21 matrix. A rescaling transformation of the interaction matrix sets the amino acid–solvent interactions Ei0(i = 1...20) and the solvent–solvent interaction equal to zero, allowing the solvent to be modeled implicitly in simulations and saving computer time. However, the 20 amino acid–solvent interaction energies are now embedded in the 210 amino acid–amino acid interaction energies with physical significance that is no longer readily identified.

Miyazawa and Jernigan (1985) made several approximations in the process of deriving their interaction energies from X-ray–structure contact frequencies. As a result, the average amino acid–solvent contact energy is about twice as large as almost all of those from previous estimates of hydrophobic energies.

Because of the unrealistic difference in hydrophobic energies and because of other results (see below) reported by several investigators, we reconsider the MJ interaction model. Our re-examination is motivated by the following: First, aggregation studies with the MJ energy scale (Broglia et al. 1998) for two 36-mers in water showed that irreversible aggregation occurs for all amino acid sequences studied whenever two strongly attractive sites of the stable cores of the chains come into contact during the folding process. However, in nature, this irreversible aggregation seems to be the exception rather than the rule for proteins under physiological conditions (West et al. 1999). Second, adsorption studies of a 27-mer at an oil–water interface (Anderson et al. 2000) showed reasonable physical behavior of a single chain in the bulk (reversible folding) and at the interface (irreversible adsorption), when a hydrophilic amino acid (histidine) was chosen to represent water and a hydrophobic amino acid (glycine) represented oil. However, by using these same assignments for "water" and "oil," our present simulations with two identical chains exhibit irreversible aggregation in the bulk and a very weak, reversible aggregation at the inferface. Because this calculated result is essentially the opposite of what we would expect from experimental data (Cascão Pereira et al. 2002, 2003), we want to examine the effect of changes in the pertinent energy parameters. Third, Giugliarelli et al. (2000) performed two-dimensional (2D) lattice MC simulations with a two-letter HP model to study the effect of interaction energies on the formation of compact states (states with minimal surface) or incompact native states for single chains and for bulk aggregation. These investigators found three classes of protein-like sequences: nonaggregating, aggregating in the native state (crystallizing), and prion-like aggregates. For a given amino acid sequence or composition, the type of behavior depends strongly on the interaction energies. Fourth, Gutin et al. (1995) studied the effect of mean attraction in three dimensions, not of a contrast parameter. Our effort focuses on the roles of a mean attraction and of the degree of contrast among solvent–hydrophilic interactions and solvent–hydrophobic interactions. Fifth and finally, Booth et al. (1997) found experimentally that lysozyme, which is known not to form any aggregates in vivo, does aggregate in vitro under appropriate conditions. The literature indicates that almost any protein can form aggregates under some conditions (Booth et al. 1997), such as the temperature, solvent pH, and salt concentration, or the presence of co- or antisolvents. These solvent characteristics can, in principle, be modeled with lattice simulations by the use of effective amino acid–solvent energy parameters.


    A new interaction model
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
In this work, we generate amino acid residue–solvent interaction energies that are consistent with the relative interactions present in the MJ model but that allow us, by changing one parameter, to alter the trend of our lattice proteins to prefer compact structures. In addition, with a second parameter, our interaction energy scale allows us to adjust the difference between two interaction energies, that is, the interaction energy between a hydrophobic amino acid and the solvent and the interaction energy between a hydrophilic amino acid and the solvent. By using these two parameters, we design solvents of different polarity, study protein folding, and aggregation in these solvents, and then, upon comparing with experimentally observed behavior for aqueous protein solutions, we decide which of them is most conducive to represent real protein behavior.

To elucidate the basis of our interaction-energy model, we recall some features of lattice simulations and the associated characteristic energies. Consider the exchange of the bond between a pair of two identical amino acid sites (Ai)2 and the bond between two solvent molecules, S, to form a bond between the amino acid site and the solvent AiS:


The energy difference for this exchange reaction is obtained from the energies of the pertinent species: 2Ei0 - E00 - Eii (for i = j, see equation 5a in Miyazawa and Jernigan 1985). Ei0 is the absolute interaction energy between amino acid i and the solvent, E00 is the absolute interaction energy between two solvent sites, and Eii is the absolute interaction energy between two amino acids of the same type i. One half of this reaction energy is known as exchange energy, {omega}i. Miyazawa and Jernigan present relative energies denoted by a small e instead of absolute energies (denoted by capital E). Clearly, the definition of the exchange energy holds for the relative as well as for the absolute energies. Because e00 is set to zero by the rescaling transformation in the original MJ energy scale, as well as in our energy scale, {omega}i = ei0 - 1/2eii. Therefore, it is the exchange energies {omega}i, and not the amino acid–solvent interaction energies, ei0, that are the distinctive energies determining whether the amino acid–solvent interaction is effectively attractive ({omega}i < 0) or repulsive ({omega}i > 0). Nevertheless, we still present the amino acid~solvent interaction energies ei0 because they are the quantities that are used in the simulations.

We propose a new expression for the interaction energy ei0 to model the solvent properties,


(1)

or equivalently:


(2)

where eii and all other interaction energies eij are taken from Table 5 of Miyazawa and Jernigan (1985). Two parameters arise in equations 1Go and 2Go. The parameter determines how well the solvent likes amino acids in general. Strongly negative values promote unraveled forms of the protein, whereas strongly positive values stabilize compact forms. Cs is a contrast parameter. If it is close to zero, the solvent solvates hydrophobic amino acids approximately as well (or as poorly) as hydrophilic amino acids, because in this case . A positive Cs value means that the solvent prefers hydrophilic acids over hydrophobic ones (i.e., {omega}i is relatively more negative for the hydrophilic residues). As Cs is raised toward unity, the solvent more strongly prefers the hydrophilic acids relative to the hydrophobic acids. Consequently, the contrast between the hydrophilic and hydrophobic sites is magnified. With Cs < 0, oil-like solvents can be modeled, because in this case, contacts between the solvent and hydrophobic amino acids are preferred relative to the hydrophilic amino acids. When Cs = 1 and , the original MJ model is recovered.

To avoid the use of explicit amino acid– solvent interaction energies, thereby fascilitating fast simulations, the matrix transformation described by Miyazawa and Jernigan (1985) can be applied easily:


(3)

where denotes the new relative interaction energies for the implicit-solvent simulations. However, if simulations with more than one solvent are performed (e.g., to study proteins at a fluid phase interface), the solvents can no longer be considered implicitly. Because our simulation code is designed to perform later such interface simulations, we use explicit amino acid–solvent interaction energies here. In multiple-solvent simulations, three kinds of solvent parameters are necessary: Csi and are single-solvent parameters, and e0i0j is the binary solvent–solvent interaction energy for solvents i and j. All solvent–solvent self-interaction energies e0i0i can be set to zero.

The sign of does not reveal whether the amino acid–solvent interaction is attractive or repulsive until Cs is considered as well. Therefore, we eliminate in favor of the perhaps physically more meaningful parameter:


(4)

or


(5)

where the sum runs over the 20 types of amino acids; that is, n = 20.

When is negative, amino acids interact on average attractively with the solvent. When is positive, they repel the solvent and effectively attract each other. Because it is more convenient to think in terms of amino acid interactions than in terms of amino acid–solvent interactions in the rest of this article, we call the "effective amino acid–amino acid mean attraction" or briefly "mean attraction." Although based on exchange energies, a positive means an effective amino acid attraction in this context.

Equation 5Go is substituted into equations 1Go and 2Go to eliminate :


(6)

or


(7)

To illustrate the meaning of the two energy-scale parameters Cs and , Figure 1Go graphs the exchange energy {omega}i versus the amino acid~amino acid self-interaction energy eii. Results are shown for values of Cs and that we find to be typical for relatively fast folding interaction scales of our 27-mer protein. For comparison, data for the original MJ model are also shown as closed circles. Open circles and diamonds show the effect of Cs when is approximately constant (-1/2Cs is the slope). As the slope becomes more negative, there are increasing differences in hydrophobicity among the 20 amino acids. Open circles and squares demonstrate different values of with the same . A high-lying data set means a strong mean attraction and features compact states, whereas a low-lying data set with less mean attraction promotes noncompact states. Figure 1Go also shows that the amino acids are not evenly spaced along the {omega}i energy scale but rather fall into groups of different polarity, as they do in the eii scale of the MJ model. This relative polarity grouping of the MJ scale is also present in our interaction energy model. In the MJ scale (filled circles), the contrast parameter (Cs = 1) and the mean attraction are much stronger than those in the parameter sets with fastest folding and highest cooperativity properties. Only 19 symbols per data set can be seen in the figure because glutamine and proline have exactly the same self-interaction energy.



View larger version (17K):
[in this window]
[in a new window]
 
Figure 1. Plot of the exchange energy for some selected amino acid–water interaction energy scales. Positive values of {omega}i indicate net repulsive interactions with the solvent. Hydrophobic amino acids are shown on the left; polar ones, on the right. (Open circles) Cs = 0.2 and ; (open squares) Cs = 0.2 and ; (open diamonds) Cs = 0.4 and ; and (filled circles) Cs = 1 and (original MJ scale).

 
There is some evidence to believe that real proteins are designed by nature not only for stability, as most lattice-model proteins are, but to provide a compromise between stability and fast folding (Mirny et al. 1998; Gruebele 1999). It is also known that natural proteins have a higher cooperativity than do proteins in lattice-based models (Kaya and Chan 2000). Cooperativity measures how sharply the transition takes place from the unfolded state to the folded state (and vice versa) based on the temperature change, {Delta}T, that dictates this transition. A small number for {Delta}T/Tm, where Tm is the midpoint temperature, means that the structure change from the unfolded to folded state takes place in a narrow temperature range. Therefore, the transition is sharp and the cooperativity is high when {Delta}T/Tm is small ({Delta}T/Tm and Tm are defined quantitatively in Materials and Methods). Thus, to compare the model 27-mer protein with actual proteins, we examine the effect of our energy-parameter space (i.e., Cs and ) on folding speed and cooperativity.


    Results
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
First, MC simulations were performed with three different native conformations of 27-mers in water (the native conformation is the path the chain takes through the 3 x 3 x 3 cube). Details of our simulation procedures are presented in Materials and Methods. From the three conformations, the one shown in Figure 2Go folds fastest. It was, therefore, used in all further 27-mer simulations. Table 1Go shows the midpoint temperatures Tm, the folding times tF (defined in Materials and Methods) at Tm, and the number of total contacts Ntot at Tm for the sets of amino acid–water energy parameters studied with the chosen native conformation. For many of the parameter sets, the cooperativity has also been determined and is reported. As expected, a higher effective amino acid–amino acid mean attraction correlates with a higher midpoint temperature. However, the mean folding times and the cooperativities give more diagnostic results. For each Cs, there is an optimal with respect to folding speed and cooperativity. The mean attraction necessary for a strong cooperativity is slightly higher than that for fastest refolding.



View larger version (60K):
[in this window]
[in a new window]
 
Figure 2. Schematic of the native state of a compact 27-mer (Cs = 0.2 and ). Light gray is used for hydrophilic amino acid beads; medium gray, for neutral; and dark gray, for hydrophobic beads. Each bead is labeled with the one-letter abreviation for the amino acid it represents.

 

View this table:
[in this window]
[in a new window]
 
Table 1. Midpoint temperatures, Tm; mean folding times, tF; number of total contacts: Ntot at Tm; and cooperativity, {Delta}T/Tm, for the interaction-energy parameter sets for 27-mers
 
Figure 3Go summarizes pictorially the effect of Cs and on folding speed and cooperativity, whereas precise values for these parameters are given in Table 1Go. Folding time is represented by the diameter of circles; {Delta}T/Tm, by the size of diamonds. Small symbols denote high folding speed and strong cooperativity and are, therefore, desireable. For each given contrast parameter, Cs, we find an optimal value of for the folding speed and a slightly higher optimum value for the cooperativity. The reason for this is as follows: If the mean attraction is not too strong , some of the hydrophilic amino acids interact more favorably with water than with any amino acid. When there are only a few such amino acids and these beads are not in terminal positions, the maximum compact state (Fig. 2Go) is still the lowest energy state because the water-liking amino acids are forced in place by their neighbors. Although this effect elevates the energy level of the folded state, it decreases the free-energy barrier of the transition state between the folded and the unfolded states and, therefore, facilitates fast folding. When the number of such highly hydrophilic amino acids increases, there is a point at which the most stable state is one with only 26 or even 23 of the contacts present in the 3 x 3 x 3 cube, rather than 28 characteristic of the maximum compact one. Figure 4Go shows the most stable structure for all 27-mer simulations with Cs = 0.2 and . Here some of the terminal beads are completely immersed in water. The free-energy barriers between the fully folded and the unfolded states are very low at the midpoint temperature, and folding is very fast. Under these conditions, significant cooling is required to condense the chain to the fully folded state; in some cases, even a very low temperature cannot achieve condensation. Table 2Go shows the number of native contacts at a lower temperature of T = 0.85Tm as function of the mean attraction. For , the chain is basically folded, whereas simulations with a weaker mean attraction show much less native structure. The opposite is true for a temperature above the mean temperature: The chain unfolds well when the mean attraction is weak and maintains many contacts when the mean attraction is large. These two effects reflect the decreased cooperativity for too weak or too strong mean attractions.



View larger version (15K):
[in this window]
[in a new window]
 
Figure 3. Comparison of water interaction energy scales used for the 27-mer. Circles indicate refolding time tf; the diameter of the circles is proportional to tf. Diamonds indicate cooperativity {Delta}T/Tm. The cooperativity is only determined for some interaction assignments.

 


View larger version (59K):
[in this window]
[in a new window]
 
Figure 4. Schematic of the native state of a noncompact 27-mer (Cs = 0.2 and ). Light gray is used for hydrophilic amino acid beads; medium gray, for neutral; dark gray, for hydrophobic beads. Each bead is labeled with the one-letter abreviation for the amino acid it represents.

 

View this table:
[in this window]
[in a new window]
 
Table 2. Native contacts, total contacts, and folding time for single-chain–27-mer simulations at T = 0.85Tm
 
For high values of , there is always a high energy barrier for a transition from one relatively compact state to another one because, in between, there is a less compact transition state that limits refolding speed. Also, the unfolded state is more compact than that for a weak mean attraction; the system must be heated substantially to unfold completely. These two effects decrease the cooperativity. For a strong mean attraction, the number of total contacts at the midpoint temperature is substantially higher than for a weak mean attraction, whereas the number of native contacts is always 14, by definition. This means that more nonnative contacts are formed in the case of a high mean attraction, or in other words, in this case, the interactions are less specific.

Table 1Go and Figure 3Go show that there is a single optimal value of Cs for fast folding and strong cooperativity. It is ~0.2. If Cs is larger, transition states with an exposed hydrophobic amino acid are too unfavorable for fast folding. Conversely, because there is too little energy difference between hydrophilic residues–water and hydrophobic residues–water interactions, small Cs values provide only a weak force to drive hydrophobic amino acids into the core of the folded protein.

Figures 5Go and 6Go illustrate the difference between a simulation with a strong and a weak mean attraction by plotting the number of native contacts, Nnat, versus the number of MC steps. Both simulations are performed at Tm with the contrast parameter that features fastest folding (Cs = 0.2). If the mean attraction is moderately strong (cf. Fig. 5Go), the maximum compact state is the most stable state. This can be seen from the high density of states with 28 native contacts. The folded state is well separated from the unfolded state in terms of native contacts, and the cooperativity is high. If the mean attraction is weak, as in Figure 6Go, the range of native contacts for the unfolded state (zero to 12 native contacts) is closer to the number of contacts for the folded state (16 to 28 native contacts) and the fully folded state (Nnat = 28) is less populated than a semifolded state with 23 to 26 native contacts.



View larger version (61K):
[in this window]
[in a new window]
 
Figure 5. Folding and unfolding kinetics at Tm represented by the number of native contacts as a function of MC steps. Cs = 0.2 and .

 


View larger version (66K):
[in this window]
[in a new window]
 
Figure 6. Folding and unfolding kinetics at Tm represented by the number of native contacts as a function of MC steps. Cs = 0.2 and .

 
Another quantity that can provide information about the folding–unfolding equilibria is the profile of the reduced free energy F/kBT of this process, where kB is the Boltzmann constant in MJ-energy units divided by associated MJ-temperature units. We present this profile in Figure 7Go for some of our folding simulations versus the number of native contacts, Nnat. There are three dashed lines representing different mean attractions, all at the respective midpoint temperatures. The free energy of the unfolded state (5 < Nnat < 10) is approximately the same as the free energy of the fully folded state, Nnat = 28, that is, arbitrarily set to zero in this graph. The two free energies are not exactly the same, because in our work, the midpoint temperature is defined via the number of native contacts, not via the free energy. When we move toward the right in Figure 7Go from the left minimum, which corresponds to the unfolded state, the increase in free energy corresponds to a folding barrier. Then, we encounter a minimum at Nnat = 23 for all mean attraction strengths that corresponds to the noncompact conformation shown in Figure 4Go. As stated above, this conformation has a lower energy than the fully folded state, Nnat = 28, for ; for the other two mean attractions shown, the state with Nnat = 28 has the lowest energy. The state with 23 native contacts can be realized in about twice as many ways as the one with 28 contacts, which causes a negative contribution of the entropy to the free energy of the 23-native-contacts state. These two effects lead to the differences in free energy between the Nnat = 23 state and the Nnat = 28 state that we can observe in Figure 7Go. When we move further toward a high number of native contacts, there is another free-energy barrier at 24 and 25 native contacts before the free energy declines again at 26 and 28 native contacts. Therefore, our model protein is a three-state folder, unlike most small actual proteins. From an analysis of the folding kinetics, we establish that the main path for the folding of a semifolded chain with 23 native contacts to one with 28 contacts goes via a 24- and 26-contact route. The state with 25 contacts is not needed for this event; a state with 27 does not exist. The cause for the three folding states is the heterogeneous distribution of the particular chosen contact energies in the native state of our chain (Abkevich et al. 1996), which is, in turn, a result of the structure of the native conformation, and the amino acid sequence and composition. Our energy scale plays only a minor role in determining whether the model protein is a two- or three-state folder.



View larger version (21K):
[in this window]
[in a new window]
 
Figure 7. Reduced-free-energy profile of the folding–unfolding equilibrium, Cs = 0.2. The various lines correspond to different values and temperatures, as labeled.

 
Because the states with 23 to 28 native contacts differ only by the positions of three of the four terminal beads on one end of the chain and because the free energies of three of these states (those with 23, 26, and 28 native contacts) are very similar, the definition of the native state is somewhat arbitrary. In this article, our results for the 27-mer are based on the maximally compact native state with 28 native contacts, in accord with the usual procedure in the literature (Shakhnovich and Gutin 1993; Abkevich et al. 1994; Shakhnovich 1994; Dinner et al. 1996; Li et al. 1996; Broglia et al. 1998; Mirny et al. 1998; Tiana et al. 1998; Du et al. 1999; Harrison et al. 1999; Pande and Rokhsar 1999; Anderson et al. 2000; Bratko and Blanch 2001; Castells et al. 2002). We have also evaluated the effect of a definition of the native state based on only 23 native contacts, which is indicated by our free-energy profile and has also been suggested by Crippen and Chhajer (2002). By using Nnat >= 23 instead of Nnat = 28 as definition of the native state, the free energy of the folded state is obviously reduced, but the results do not change qualitatively.

Figure 7Go also contains the energy profile of one chain that is 90% folded and one that is 90% unfolded for the medium mean attraction. In the high-temperature simulation, T = 1.81Tm, the free energy has a minimum at two native contacts, whereas >10 are rarely observed. In the low-temperature simulation, T = 0.76Tm, however, a rather large range of native contacts can still be seen, and the minimum at Nnat = 28 is not as pronounced as one might expect. Note that when a line ends, the corresponding values of the order parameter have not been sampled during the simulation. This means that the free energies of these states are very high but cannot be determined without applying biasing techniques.


    Discussion
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
We have introduced a new interaction-energy scale, extended from the original MJ interaction scale, for the simulation of model proteins on a cubic lattice. We introduce two energy parameters, Cs and , that modify the MJ amino acid–solvent interaction energies and permit us to study the effect of the solvent on protein folding. In the parameter space studied (selected combinations of 0.05 <= Cs <= 1.0 and ) and with 27-mers, we discover four regimes corresponding to different protein behavior. Figure 8Go displays these four regions in space and their approximate boundaries. Protein simulations using parameters from regions I and IV show a behavior that is not typical for real proteins under native conditions. Region I reflects slow folding and weak cooperativity. The original MJ interaction-energy matrix lies in region I. In region IV, the most compact state is highly unstable, which results in slow folding and a low cooperativity. There is a set of medium compact states with similar energies, in which essentially no folding–unfolding event is observed. Energy parameters characteristic of regions II and III can be used to study actual proteins, because they exhibit the most realistic behavior and allow the shortest simulation times. In region II, the 3 x 3 x 3 cube is the lowest energy state, and it exhibits the highest cooperativity of all interaction models and fast folding. Therefore, region II corresponds to proteins having a compact native state that is well defined for all amino acids. Many real proteins show some flexibity even in the native state; parts of the chain cannot be resolved in X-ray crystal structures (Crippen and Chhajer 2002). Such proteins can be modeled with parameters from region III; they show the fastest folding and also a relatively high cooperativity. In our study, the reference state for folding rate and cooperativity is the compact cube. If this reference state is changed to a noncompact state, as recently suggested by Crippen and Chhajer (2002), proteins with parameters from region II will likely show even "better" behavior in the sense of fast folding and high cooperativity.



View larger version (9K):
[in this window]
[in a new window]
 
Figure 8. Approximate boundaries between four regions of different chain behavior in parameter space. The most compact state (3 x 3 x 3 cube; cf. Fig. 2Go) is the lowest energy state above the dashed line; semicompact states are the lowest energy states below this line (cf. Fig. 3Go). Region I indicates slow folding and weak cooperativity; region II, highest cooperativity and fast folding; region III, fastest folding and a noncompact native state; and region IV, slow folding. There is a set of medium compact states with similar energies. Parameters taken from region II are most suitable to study protein behavior with a compact native state. Parameters taken from region III can be used to model semicompact proteins. The original MJ interaction scale (Cs = 1 and ) lies in region I outside of the graph.

 
Gutin et al. (1995) used a modified MJ energy set that contains one parameter to adjust the overall attraction of amino acid beads similar to our parameter , but they keep the original MJ contrast unchanged. These investigators studied the folding of a 36-mer at ~0.9 Tm for a relatively strong and a relatively weak mean attraction. Gutin et al. found the formation of a compact globule as a first step for the strong-mean-attraction case, followed by a first-order transition to the native state. Similar to our results, during their simulations, the number of total contacts is much higher than the number of native contacts. For a weak mean attraction, most of the states that are formed during the folding process are native contacts, and the formation of a compact state happens at the same time as the formation of the native state. In contrast to our results, Gutin et al. found the mean folding time similar for the strong and the weak mean attraction. Apparently, these investigators used a lower temperature than that studied here. Also they determined the folding time of an unfolded chain to a folded one, whereas we measure the refolding time between an unfolded and a folded state when both are in equilibrium.

Our simulations indicate that the two possible design goals (fast folding and high cooperativity) for proteins are correlated. Therefore, if the system is optimized for one of the goals, the other one is rather close to its optimal value. This holds especially for the contrast parameter, Cs. For 27-mers, fastest folding requires a slightly lower mean attraction than does high cooperativity, whereas a mean attraction between these two optima remains very good for fast folding and high cooperativity. In many studies (Shakhnovich and Gutin 1993; Shakhnovich 1994; Dinner et al. 1996; Li et al. 1996; Tiana et al. 1998; Harrison et al. 1999), the importance of a generally high stability of the native state has been emphasized. Recently, this issue has been discussed with respect to interaction potential and sequence (Giugliarelli et al. 2000; Crippen and Chhajer 2002). In our study, the amino acid sequences of our lattice proteins for a given interaction parameter set are optimized for stability of a given native conformation. However, investigation of a series of interaction scales reveals that this interaction set should not be optimized for stability of the native state. Rather an interaction set that generates only marginally stable native states best features fast folding and high cooperativity, in agreement with experimental findings that proteins are not solely optimized for stability (Gruebele 1999).

Finally, we offer brief prospective remarks based on our work. First, the emphasis of this work is on the optimization of the amino acid–solvent interaction energies for native-protein–like behavior. Nevertheless, we assert that the results can be used to study sequence-related effects of protein aggregation. In doing so, energy-scale-related aggregation effects can be separated from sequence-related aggregation effects. This may aid in further understanding of the aggregation process. Second, the behavior of chains in water as the solvent is now modeled more realisticaly, and extension to other solvents is straightforward. In addition, the effect of changing solvent characteristics on protein folding can be studied. However, relations between our parameters and quantities such as pH, salt concentration, and the presence of co- or antisolvents remain to be established.


    Materials and methods
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
All simulations are performed on a 3D-cubic lattice with periodic boundary conditions and a size of 20 x 20 x 20 sites. Proteins are modeled by self-avoiding chains of beads, each bead representing a Kuhn segment (a Kuhn segment corresponds to approximately five amino acid residues and provides the flexibility to form bond angles of 90 or 180 degrees, consistent with a cubic lattice; Boyd and Phillips 1993). Although a Kuhn segment reflects several amino acids, we attribute to it a single amino acid–residue interaction energy. This simplificaition is reasonable as long as we are aware of the limitations and strengths of the model. Our lattice representation cannot be used to describe hydrogen bonds or secondary structure or to study the folding of a real sequence found in nature. On the other hand, simple lattice models are known to capture the governing principles of hydrophobic collapse, trapping, and folding rate (Gruebele 1999). We use a modified MJ energy scale here not to represent real amino acids but to describe the variety of different interactions that are present beyond those covered by commonly used HP models. All sites that are not occupied by a chain group are occupied by an effective solvent molecule (Miyazawa and Jernigan 1985) that is a lattice site of the same size as a Kuhn segment and that can represent a group of solvent molecules corresponding to the size of the segment.

The system energy is defined as the sum of the interaction energies over all pairs of adjacent sites on the lattice, but which are not neighbors on a chain. Accordingly, the system energy is described by the Ising-like Hamiltonian:


(8)

L is the size of the lattice in lattice units and e(i,j,k),(l,m,n)is the interaction energy between an amino acid or solvent bead on site (i, j, k) and one on site (l, m, n). During the simulation, MC moves are attempted as shown in Figure 9Go. Frequency of moves is set by randomly choosing a site of the chain with equal probability for all sites. If the chosen bead is an end site, we perform an end-bend move. If it is one before then end site, we perform an invert-bend move. Otherwise an invert-bend or crankshaft move is performed with 50% probability for each move. If we try to move a chain bead to a site occupied by another chain bead, the move is immediately rejected because e(i,j,k),(i,j,k) = {infty}. If the move is to a site occupied by a solvent bead, we try to exchange the protein bead with the solvent bead on this site. Then, the move is accepted or rejected depending on the energy change, E, associated with the move in accordance with the Metropolis criterion:



View larger version (10K):
[in this window]
[in a new window]
 
Figure 9. Sketch of the types of allowed MC moves (from left to right): end-bend, kink-jump, and crank-shaft.

 

(9)


(10)

where kB, the Boltzmann constant, has the value of 1 MJ-energy unit/1 associated temperature unit. MJ energies are based on contact frequencies, and hence, they are only relative energies. To our knowledge, no conversion to conventional energy units is available. Therefore, for our purpose here, temperature can be expressed in any arbitrary unit.

Sequence and initial conformation
To design a lattice protein, it is first necessary to choose a conformation for the native state, that is, the path the chain takes in the 3 x 3 x 3 cube for a 27-mer. We undertook preliminary studies with a few different native conformations and then settled on the one shown in Figure 2Go for all simulations with 27-mers. The types of the 27 amino acid beads for a chain are chosen to reflect a composition corresponding to a typical real protein. Then the sequence is optimized by simulated annealing, meaning that sites on the chain in the native state are randomly exchanged with the Metropolis algorithm. During the run, the temperature is reduced eventually to find the global energy minimum. The sequence corresponding to this minimum is the most stable sequence for the given native conformation and interaction energy scale. To ensure that the global minimum is found, the run is then repeated, starting from the result of the previous run until no further improvement in stability is found.

A change in the mean attraction has no effect on the energy difference between different sequences for the same structure, but the contrast parameter (Cs) does. Therefore, the annealing process is repeated every time Cs is changed, but not when is altered. Table 3Go shows all sequences used in our simulations.


View this table:
[in this window]
[in a new window]
 
Table 3. One-letter code for the sequences of amino acids used in the simulations
 
The terminal beads of a chain have five interacting neighbors, but all others have only four. This difference in the number of neighbors makes the terminal beads seem more buried inside the protein structure than they actually are. Therefore, the annealing algorithm sometimes places strongly hydrophobic amino acids on these end beads, even if they are located at a corner. To avoid such unphysical results, we assign neutral or hydrophilic amino acids to the ends of our chains and add a restraint to the algorithm to keep them unchanged during the annealing process. The fact that a chain optimized with fixed end sites exhibits a higher folding rate and a higher midpoint temperature than does a freely optimized one demonstrates that the result of the free optimization is an artifact of the properties of a lattice chain. Simulations are started either from the native structure of the chain or from the equilibrated result obtained from a previous simulation. The concentration of the chains in our simulation box is ~0.3 volume %.

Data analysis
Contacts
During the simulation, we compute and record different types of contacts in addition to the energy. The number of total contacts Ntot is the number of contacts between the beads of one chain and all other amino acid beads on any chain, omitting contacts between the two neighbors that are consecutive on the same chain (one neighbor for a terminal bead). A contact between two beads exists if the beads are nearest neighbors on the lattice. The number of native contacts, Nnat, refers to those of the intramolecular contacts that are also present in the native state. Nnat provides a commonly used measure of the similarity of a given structure to the native structure. The number of total contacts gives information on how compact a protein is under given conditions, similar to the radius of gyration, but with many contacts diagnosing a compact molecule.

Midpoint temperature
Protein-like polymers exhibit a folding transition: They are unfolded at high temperatures and folded at low temperatures. The midpoint temperature Tm is defined as the temperature where, on average, half of the native contacts are present during a simulation. For a 27-mer with a 3 x 3 x 3 cube as native state having 28 native contacts, this means that Tm is the temperature at which Nnat = 14.

Cooperativity
Several definitions of cooperativity are discussed by Crippen and Chhajer (2002). We use the simplest of those for which the cooperativity, {Delta}T/Tm, is calculated from the temperatures where, on average, we observe 0.9, 0.5, and 0.1 times the maximum number of native contacts:


(11)

Again, a low value of {Delta}T/Tm corresponds to a high cooperativity and vice versa.

Folding time
Because of its importance for proteins, we also determine the folding time in our simulations. During the simulation, we record when a chain changes from a completely folded state to one with <40% of its native contacts present, or vice versa. The average number of MC time steps for one folding/unfolding cycle is designated as the mean folding time tf.

Blocking algorithm
We analyze the results of our simulations with a blocking method described by Flyvbjerg and Petersen (1989). This method permits reliable estimates for statistical errors if the simulation is ergodic, that is, if it samples all relevant states sufficiently. The blocking algorithm also helps us to recognize when our chains are in a glassy state.

Free energy
In a MC simulation, it is not possible to calculate absolute free energies. However, it is posssible to calculate free energy differences between states that can be observed in MC simulations by counting the numbers of occurences of the states (Frenkel and Smit 2002). We use the number of native contacts, Nnat, as an order parameter to create a histogram for each chain in a simulation, and we count how often the chain is in each state corresponding to the order parameter. After the simulation, the Landau free energy F(A) of state A relative to the fully folded state can be calculated from the number of times the chain is found in the state, Z(A), and from the number of times the chain is found in the fully folded state (Frenkel and Smit 2002).


(12)


    Acknowledgments
 
We thank Dusan Bratko for helpful discussions. For financial support we are grateful to DAAD (German Academic Exchange Service) for a fellowship to K.L. and to the Office for Basic Sciences of the United States Department of Energy.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


    References
 TOP
 Abstract
 Introduction
 The MJ interaction model
 A new interaction model
 Results
 Discussion
 Materials and methods
 References
 
Abkevich, V.I., Gutin, A.M., and Shakhnovich, E.I. 1994. Free energy landscape for protein folding kinetics: Intermediates, traps, and multiple pathways in theory and lattice model simulations. J. Chem. Phys. 101: 6052–6062.[CrossRef]

———. 1996. Improved design of stable and fast-folding model proteins. Fold. Des. 1: 221–230.[CrossRef][Medline]

Anderson, R.E., Pande, V.S., and Radke, C.J. 2000. Dynamic lattice Monte Carlo simulation of a model protein at an oil/water interface. J. Chem. Fys. 112: 9167–9185.

Asherie, N., Pande, J., Lomakin, A., Ogun, O., Hanson, S.R.A., Smith, J.B., and Benedek, G.B. 1998. Oligomerization and phase separation in globular protein solutions. Biophys. Chem. 75: 213–227.[CrossRef][Medline]

Booth, D.R., Sunde, M., Belotti, V., Robinson, V., Hutchinson, C.V., Fraser, P.E., Hawkins, P., Dobson, C.M., Raiford, S.E., and Blake, C.C. 1997. Instability unfolding and aggregation of human lysozyme variants underlying amyloid fibrilogenesis. Nature 385: 787–793.[CrossRef][Medline]

Boyd, R.H. and Phillips, P.J. 1993. The science of polymer molecules. Cambridge Universty Press, Cambridge, UK.

Bratko, D. and Blanch, H.W. 2001. Competition between protein folding and aggregation: A three-dimensional lattice-model simulation. J. Chem. Phys. 114: 561–569.[CrossRef]

———. 2003. Effect of secondary structure on protein aggregation: A replica exchange simulation study. J. Chem. Phys. 118: 5185–5194.[CrossRef]

Broglia, R.A., Tiana, G., Pasquali, S., Roman, H.E., and Vigezzi, E. 1998. Folding and aggregation of designed proteins. Proc. Natl. Acad. Sci. 95: 12930–12933.[Abstract/Free Full Text]

Cascão Pereira, L.G., Hickel, A., Radke, C.J., and Blanch, H.W. 2002. Kinetic model for enzyme interfacial activity and stability: pa-hydroxynitrile lyase at the diisopropyl ether/water interface. Bioeng. Biotech. 78: 595–605.[CrossRef]

Cascão Pereira, L.G., Theodoly, O., Blanch, H.W., and Radke, C.J. 2003. Dilatational rheology of BSA conformers at the air/water interface. Langmuir 19: 2349–2356.[CrossRef]

Castells, V., Yang, S., and Van Tassel, P.R. 2002. Surface-induced conformational changes in lattice model proteins by Monte Carlo simulation. Phys. Rev. E 65: 031912–031918.[CrossRef]

Cheung, M.S., Garcia, A.E., and Onuchic, J.N. 2002. Protein folding mediated by solvation: Water expulsion and formation of the hydrophobic core occur after the structural collapse. Proc. Natl. Acad. Sci. 99: 685–690.[Abstract/Free Full Text]

Cohen, F.E., Pan, K.-M., Huang, Z., Baldwin, M., Fletterick, R., and Prusiner, S.B. 1994. Structural clues to prion replication. Science 264: 530.[Free Full Text]

Costantino, H.R., Langer, R., and Klibanov, A.M. 1995. Aggregation of lyophilized pharmaceutical protein, recombinant human albumin: Effect of moisture and stabilization by experiments. Biotechnology 13: 493–496.[CrossRef][Medline]

Crippen, G.M. and Chhajer, M. 2002. Lattice models of protein folding permitting disordered native states. J. Chem. Phys. 116: 2261–2268.[CrossRef]

Dinner, A.R., Sali, A., and Karplus, M. 1996. The folding mechanism of larger model proteins: Role of native structure. Proc. Natl. Acad. Sci. 93: 8356–8361.[Abstract/Free Full Text]

Du, R., Pande, V.S., Grosberg, A.Y., Tanaka, T., and Shakhnovich, E. 1999. On the role of conformational geometry in protein folding. J. Chem. Phys. 111: 10375–10380.[CrossRef]

Fink, A.L. 1998. Protein aggregation: Folding aggregates, inclusion bodies and amyloid. Fold. Des. 3: R9–R23.[CrossRef][Medline]

Flyvbjerg, H. and Petersen, H.G. 1989. Error estimates on averages of correlated data. J. Chem. Phys. 91: 461–466.[CrossRef]

Frenkel, D. and Smit, B. 2002. Understanding molecular simulation, Chapter 7. Academic Press, New York.

Georgiou, G. and Bowden, G.A. 1990. Inclusion body formation and the recovery of aggregated recombinant protein. In Recombinant DNA technology and applications (eds. A. Prokop et al.), pp. 333–356. McGraw Hill, New York.

Giugliarelli, G., Micheletti, C., Banavar, J.R., and Maritan, A. 2000. Compactness, aggregation, and prionlike behavior of protein: A lattice model study. J. Chem. Phys. 113: 5072–5077.[CrossRef]

Go, N. 1983. Theoretical studies of protein folding. Annu. Rev. Biophys. Bioeng. 12: 183.[CrossRef][Medline]

Gruebele, M. 1999. The fast protein folding problem. Annu. Rev. Phys. Chem. 50: 485–516.[CrossRef][Medline]

Gupta, P., Hall, C.K., and Voegler, A.C. 1998. Effect of denaturant and protein concentrations upon protein refolding and aggregation: A simple lattice model. Protein Sci. 7: 2642–2652.[Abstract]

Gutin, A.M., Abkevich, V.I., and Shakhnovich, E.I. 1995. Is burst hydrophobic collapse necessary for protein folding? Biochemistry 34: 3066–3076.[CrossRef][Medline]

Harrison, P.M., Chan, H.S., Prusiner, S.B., and Cohen, F.E. 1999. Thermodynamics of model prions and its implications for the problem of prion protein folding. J. Mol. Biol. 286: 593.[CrossRef][Medline]

Hirst, J.D. 1999. The evolutionary landscape of fundamental model proteins. Protein Eng. 12: 721–726.[Abstract/Free Full Text]

Janicke, R. 1995. Folding and association versus misfolding and aggregation of proteins. Philos. Trans. R. Soc. Lond. B 348