1 Department of Cellular and Molecular Pharmacology, University of California, San Francisco, California 94143, USA
2 Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143, USA
3 Department of Biochemistry and Biophysics, University of California, San Francisco, California 94143, USA
4 Department of Medicine, University of California, San Francisco, California 94143, USA
5 Department of Neurology, University of California, San Francisco, California 94143, USA
6 Department of Biochemistry, Faculty of Medicine, University of Toronto, Toronto, Ontario M5S 1A8, Canada
7 Department of Medical Genetics and Microbiology, Faculty of Medicine, University of Toronto, Toronto, Ontario M5S 1A8, Canada
Reprint requests to: Dr. Paul Harrison, Department of Molecular Biophysics and Biochemistry, Room 428, Yale University, 266 Whitney Avenue, P.O. Box 208114, New Haven, CT 06520-8114, USA; e-mail: harrison{at}csb.yale.edu; fax: (509) 691-6906.
(RECEIVED September 12, 2000; FINAL REVISION January 18, 2001; ACCEPTED January 18, 2001)
Article and publication are at www.proteinscience.org/cgi/doi/10.1110/ps.38701.
| Abstract |
|---|
|
|
|---|
Keywords: Amyloid; propagation; prion; protein folding; simulation
| Introduction |
|---|
|
|
|---|
In amyloidoses, the N native soluble form of a protein undergoes a conformational change and assembles into an amyloid fibril. There appears to be a direct causal link between the formation of such amyloid fibrils and the pathogenesis of these diseases. For example, in Alzheimer's disease, A-β peptide is metastable in its soluble state and becomes more structured, with β-sheet forming on appearance of an amyloidogenic intermediate (Kelly 1996, Kelly 1998; Selkoe 1997). Lansbury and coworkers have shown that the Alzheimer's amyloid assembly process is modular, with at least two quaternary structural intermediates during amyloid filament assembly (Harper et al. 1997). However, it has also been shown that Alzheimer's amyloid can grow through the irreversible binding of monomers to the ends of a fibril (Lomakin et al. 1996). In contrast to Alzheimer's disease, the amyloidogenic intermediate of other amyloid disease proteins (such as in nonneuropathic lysozyme amyloidosis [Booth et al. 1997] or transthyretin-based systemic amyloidoses [Lai et al. 1997]) appears to be less structured than the N native state. Recently, amyloid fibrils have been made from proteins independent of any immediate disease context (Chiti et al. 1999; Jimenez et al. 1999).
A prion is an alternative propagatable conformation for a protein that refolds from the N native state. The prion particle forms under the appropriate conditions and, in a cellular or intact animal system, converts copies of the N native conformation of the protein to the alternative conformation. In the prion diseases of humans and other mammals, the cellular prion protein PrPC changes conformation to a disease-causing form PrPSc that is rich in β-sheet (Pan et al. 1993). PrPSc is the obligatory and major component of the infectious prion particle (Prusiner 1982). The minimal size of the PrPSc particle appears to be a dimer, as deduced from ionizing radiation studies (Bellinger-Kawahara et al. 1988). Although PrPSc can assemble into amyloid-like rods following partial proteolytic degradation to form PrP(27–30) (Nguyen et al. 1995), amyloid formation is not required for infectivity (Wille et al. 1996). The ability of prions to infect organisms and direct their self-propagation is a principal distinction between prions and other amyloidogenic particles. Three other prions have been discovered: [PSI] and [URE3] in the yeast Saccharomyces cerevisiae and [Het-s] in the fungus Podospora anserina (Lindquist 1997; Wickner 1997). Although the prion proteins are central actors in replication, additional cellular factors appear to play a role in the replication process, including 'Protein X' in the case of PrP (Kaneko et al. 1997) and the chaperone Hsp104 for [PSI] (Chernoff et al. 1995). Strains of prions for PrPSc have been identified that have distinct disease incubation times and neurohistopathology (Harrison et al. 1997; Cohen and Prusiner 1998). A growing body of work indicates that the biological properties of up to eight different prion strains are encrypted in the PrPSc tertiary structure (Bessen et al. 1995; Telling et al. 1996; Collinge et al. 1996; Safar et al. 1998). There is also evidence for distinct strains of the [PSI] prion (Derkatch et al. 1996).
Simple lattice models have helped in understanding many properties of protein folding and evolution (Leopold et al. 1992; Chan and Dill 1991, Chan and Dill 1994; Sali et al. 1994; Bryngelson et al. 1995; Li et al. 1996; Klimov and Thirumalai 1996; Dill and Chan 1997; Govindarajan and Goldstein 1997; Bornberg-Bauer and Chan 1999; Buchler and Goldstein 1999; Pande and Rokhsar 1999; Dinner and Karplus 1999). A number of lattice studies have investigated various aspects of misfolding or folding to alternative conformations. In a previous lattice model study, we showed that less stable proteins intrinsically have a greater chance of encrypting alternative native states as multimers (such as those that occur for prions) and that the hydrophobicity of a protein sequence has no bearing on the existence of an alternative native state (Harrison et al. 1999). Shakhnovich and coworkers have studied what happens when two alternative conformations occupy the ground state for a protein sequence (a scenario that may be relevant to prion formation; Abkevich et al. 1998). They found that, under denaturing conditions, the conformation with more local contacts is folded to first. Dinner and Karplus (1998) reported an example of a model protein chain for a lattice model that folds recurrently and consistently to a metastable native state. (Similar behavior has been simulated previously for an off-lattice model for the folding of a β-barrel [Honeycutt and Thirumalai 1990].) They found that the barrier from the kinetic metastable native state to the thermodynamic lowest-energy state was chiefly entropic. There have been several lattice studies on aggregation in proteins (Broglia et al. 1998; Gupta et al. 1998; Istrail et al. 1999). Gupta et al. (1998) found that despite the simplicity of their two-dimensional (2D) model, aggregates tended to be formed from native protein contacts. Broglia et al. (1998) noted that aggregated states for two chains form from local contacts that occur early in the normal folding process between strongly interacting residues. Istrail et al. (1999) reported that the propensity to aggregate is not simply dependent on the sequence composition but also on the proportion of long runs of hydrophobic or hydrophilic residues in the sequence. A recent study (Giugliarelli et al. 2000) focused on a 2D superlattice of protein chains. It showed that the number of sequences that have compact, soluble native states is greatest at a residue interaction potential that gives protein-like hydrophobicities and for which the number of prion-forming sequences is of the same order.
How can a propagatable alternative conformation arise for a protein? To what extent can the features of prion propagation come together in a simple model of protein folding, without accessory factors? We attempt to address these questions in this article, using a simple model of protein folding. We report the first simulation of propagation of an alternative conformation using any protein model. This occurs without designing the alternative conformation. Intriguingly, the model shows some features of both amyloidogenesis and prion formation. It provides insights into possible mechanisms for propagation that may be particularly relevant to the non-PrP (yeast and fungus) prions.
| Results |
|---|
|
|
|---|
6% of random sequences. It is reminiscent of the stacking evident for type-1 pilus assembly in Escherichia coli (Choudhury et al. 1999). The packing enables stacking of multiple chains of the same conformation end to end. The R2 dimer configurations have one notable distinguishing feature: They adopt a more extended structure than other lowest-energy dimeric assemblies. (The R2 dimer assemblies are significantly more extended, as determined by a Mann-Whitney U-test [Hollander and Wolfe 1973; p = 0.013 that their more extended nature in the sample of 70 sequences is random]; see Materials and Methods.) An extended secondary structure is defined by two succeeding chain bonds that travel in the same direction on the lattice. Extended secondary structure is thus β-sheet-like. The proportion of these extended secondary structures in the R2 assemblies is 0.70 (±0.13), compared to 0.36 (±0.14) in general. Amyloid and prions have extensive β structure (Pan et al. 1993; Sunde et al. 1997; Taylor et al. 1999; King et al. 1997). The stacked β-sheet-like R state that we have found may thus be the lattice analog of the amyloid protofilament, as its stacking occurs only along one axis of lattice space.
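The definition of extended secondary structure given above can be made concrete with a short sketch. The function name `extended_fraction`, the coordinate representation, and the normalization (dividing by the number of consecutive bond pairs in one chain) are assumptions made for illustration; the text does not state how the reported proportion was normalized.

```python
from typing import List, Sequence


def extended_fraction(chain: List[Sequence[int]]) -> float:
    """Fraction of consecutive bond pairs that travel in the same lattice direction.

    `chain` is an ordered list of lattice coordinates, one per residue.
    The normalization by the number of bond pairs is an assumption.
    """
    # Bond vectors between successive residues.
    bonds = [tuple(b - a for a, b in zip(p, q)) for p, q in zip(chain, chain[1:])]
    # Pairs of succeeding bonds; a pair is "extended" when both bonds are identical.
    pairs = list(zip(bonds, bonds[1:]))
    if not pairs:
        return 0.0
    return sum(1 for u, v in pairs if u == v) / len(pairs)


# Hypothetical toy chains, not taken from the article.
rod = [(i, 0, 0) for i in range(5)]                          # fully extended
staircase = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0), (2, 2, 0)]  # turns at every step
mixed = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0)]         # one straight pair, one turn

print(extended_fraction(rod), extended_fraction(staircase), extended_fraction(mixed))
```

For an assembly of several chains, one plausible choice would be to pool the bond pairs of all chains before dividing, but that too is an assumption.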
|
|
(i, i + 3), (i + 2, i + 5), and so on (basically the definition used previously by Chan and Dill 1991). We did not find any R states for our initial sample of 70 sequences that had helical structure. A previous study showed that aggregatability relies on the composition of a sequence in terms of 'runs' of hydrophobic and hydrophilic residues (Istrail et al. 1999). In addition, we examined the sequences with an R state that we found for any obvious trends in sequence composition but found no such trends.
Two sequences studied for propagation
To investigate propagation, extensive folding and refolding simulations were performed for a model protein sequence that has a lowest-energy R2 dimer in its ground state arising from the dimeric enumerations. (The proportion of extended structure for this R2 dimer of the model protein sequence is 0.71.) A much smaller number of simulations was performed for a random sequence. The construction of these two sequences is described below in Materials and Methods. The N native states of both the model protein and random sequences are illustrated in Figures 2a and 3a. These are the native states at infinite dilution. The encoded N state of the model protein (Fig. 2a) has an obvious hydrophobic core.
|
| 2N → R2 | (1) |
| N + R2 → R3 | (2) |
| 3N → R3 | (3) |
We also consider the protein folding reaction to the N state starting from a randomly chosen unfolded conformation. This can be considered as protein folding at infinite dilution.
The parameter that was varied in our simulations is the interresidue interaction strength (ε/T), as in previous simulation work (Chan and Dill 1994, Chan and Dill 1998). Higher interaction strength (ε/T) implies less denaturing conditions (lower denaturant concentration). We used two interaction strengths (ε/T) in simulations for the model protein: one where the N state ΔGNfold for folding is ∼0.0 (fractional population f of the N state for normal folding at infinite dilution is 0.5), and the other where the N state would normally be denatured (f = 0.2). These are denoted (ε/T)_{f=0.5} and (ε/T)_{f=0.2}, respectively.
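The link between the fractional population f and the folding free energy can be illustrated with a two-state sketch. This assumes a simple two-state picture (N versus everything else) with free energy expressed in units of T (Boltzmann constant set to 1); the function name is hypothetical and the article's own definition of ΔGNfold may differ in detail.

```python
import math


def folding_free_energy_over_T(f: float) -> float:
    """Two-state estimate of the folding free energy divided by T.

    f is the equilibrium fractional population of the N state.
    Negative values favor N; zero means N and non-N are equally populated.
    """
    if not 0.0 < f < 1.0:
        raise ValueError("f must lie strictly between 0 and 1")
    # f / (1 - f) is the two-state equilibrium constant for folding.
    return -math.log(f / (1.0 - f))


for f in (0.5, 0.2):
    print(f, round(folding_free_energy_over_T(f), 3))
```

Under this assumption f = 0.5 corresponds to a folding free energy of zero, consistent with the text, and f = 0.2 gives a positive value (N disfavored).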
Lowest-energy states for sequences from simulations
We established the complete dimeric ground states of our two sequences from an examination of all of the refolding trajectories involving two chains. We have not found any assemblies of lower energy than the R-state assemblies for each sequence either for two or for three chains. For the model protein sequence, the ground-state conformations (interaction energy E = -87) for the dimeric free-energy surface are the self-similar R2 dimer (Fig. 2b) and another dimeric assembly that has one chain in the N state (Fig. 2c). For the random sequence (E = -71), there are three ground-state dimeric assemblies (Fig. 3b–d). For both examples, the interaction energy at one interface for the R conformation is ∼40% of the total in the R2 dimer (Einterface = -33 for the model protein and -27 for the random sequence). The ground-state conformations for the three-chain simulations were determined in the same manner and are also shown (Fig. 2e; Fig. 3e,f; total energy E = -147 for the model protein sequence, and E = -120 for the random sequence).
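The energies quoted above can be cross-checked with a small arithmetic sketch. It assumes that an Rn stack is additive, i.e., every chain contributes the same intrachain energy and every interface the same Einterface; this additivity and the function names are assumptions for illustration, not a statement of how the energies were computed in the article.

```python
def interface_fraction(e_dimer: float, e_interface: float) -> float:
    """Share of the R2 dimer energy that comes from its single interface."""
    return e_interface / e_dimer


def stacked_energy(n: int, e_dimer: float, e_interface: float) -> float:
    """Energy of an n-chain stack, assuming identical chains and identical interfaces."""
    e_chain = (e_dimer - e_interface) / 2  # intrachain energy per R-state chain
    return n * e_chain + (n - 1) * e_interface


# Values quoted in the text: (E for R2, Einterface).
model_protein = (-87, -33)
random_sequence = (-71, -27)

for name, (e2, e_int) in (("model", model_protein), ("random", random_sequence)):
    print(name, round(interface_fraction(e2, e_int), 2), stacked_energy(3, e2, e_int))
```

Under the additivity assumption the sketch reproduces the reported three-chain energies (-147 and -120), and the interface share comes out at about 0.38 for both sequences, in line with the stated ∼40%.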
What happens in the dimeric refolding trajectories?
Having obtained the dimeric ground-state conformation(s), we examined the trajectories for dimeric refolding in more detail at (ε/T)_{f=0.5} = 0.99 for the model protein. A trajectory for the refolding reaction 2N → R2 is illustrated for the model protein sequence at this ε/T value (Fig. 4a–d). The average behavior of each variable for all trajectories is inset in each panel of the figure. As noted in Materials and Methods, the simulation protocol for the refolding reactions, in some sense, mimics a molecular crowding effect, for example, as might occur for GPI-anchored proteins in rafts (Simons and Ikonen 1997). The behavior of four variables over the course of the trajectory was monitored. These are the number of contacts in common with the N state (denoted CN), the total energy for the system, the total number of contacts, and the total number of contacts in common with the interface of the R state (Fig. 4a–d). A good indicator of the progress of the refolding reaction is clearly the number of contacts in common with the interface of the R state (which is denoted CRinter). The mean value of CRinter during the simulations before conversion to R2 is 0.6 (±1.7) and is 6.0 (±1.7) afterward (this is the value of CRinter to which all of the simulations eventually converge [Fig. 4d]). In the illustrated example (Fig. 4), there are three distinct periods that occur: First, the two chains quickly find the lowest-energy assembly of the normal N state (denoted 2Nlowest; Fig. 4a). This has an energy of -86 dimensionless units, one higher than for the R-state dimer assembly (Fig. 2f). This first period is characterized by a lack of fluctuation in all four variables (Fig. 4). Second, there is a transient period of substantial unfolding and disassembly of the two chains in which higher-energy species are sampled, which is followed by a third period of intermediate energy fluctuation in which the R-state dimer is found. This mechanism of disassembly and unfolding is common to most of the trajectories (5/6 trajectories) that first get stuck in the 2Nlowest configuration. This refolding process bears some analogy to the folding process observed for the folding of smaller HP-model proteins (Chan and Dill 1994), with the 2N state as an apparent kinetically trapped intermediate similar to those predicted for single-chain folding (Bryngelson et al. 1995; Bryngelson and Wolynes 1987, Bryngelson and Wolynes 1989; Thirumalai and Woodson 1996). The average behavior (over all trajectories) of the total number of contacts and the total energy indicates a rapid (within 625,000,000 moves) convergence to sampling high-contact low-energy assemblies (Fig. 4b,c). However, these retain substantial similarity to the N state until ∼1.25 × 10^11 moves have elapsed (Fig. 4a), whereafter the simulations gradually converge to a value of CN midway between 8 (= CN for the R2 dimer) and 11 (= CN for its partner ground-state assembly).
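A progress variable such as CRinter (interface contacts shared with the R state) can be sketched as a set intersection between the current interchain contacts and a reference set. The helper names, the nearest-neighbor contact criterion, and the toy coordinates below are assumptions for illustration; the sketch also ignores the fact that two identical chains can be relabeled, which a full implementation would need to handle.

```python
from itertools import product
from typing import List, Sequence, Set, Tuple

Contact = Tuple[int, int]


def interchain_contacts(chain_a: List[Sequence[int]],
                        chain_b: List[Sequence[int]]) -> Set[Contact]:
    """Residue index pairs (i in A, j in B) that sit on adjacent lattice sites."""
    contacts = set()
    for (i, p), (j, q) in product(enumerate(chain_a), enumerate(chain_b)):
        # Adjacent sites on a square or cubic lattice are at Manhattan distance 1.
        if sum(abs(x - y) for x, y in zip(p, q)) == 1:
            contacts.add((i, j))
    return contacts


def shared_contacts(current: Set[Contact], reference: Set[Contact]) -> int:
    """Number of current contacts that also occur in the reference assembly."""
    return len(current & reference)


# Hypothetical toy example: two short rods stacked in register, then shifted by one site.
chain_a = [(0, 0), (1, 0), (2, 0)]
in_register = [(0, 1), (1, 1), (2, 1)]
shifted = [(1, 1), (2, 1), (3, 1)]

reference = interchain_contacts(chain_a, in_register)
print(shared_contacts(interchain_contacts(chain_a, in_register), reference))
print(shared_contacts(interchain_contacts(chain_a, shifted), reference))
```

Recording this count at regular intervals along a trajectory would give a trace of the kind described for Figure 4d.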
|
(ε/T)_{f=0.5} = 0.99 for two chains (Table 1; R2). Importantly, most (27 of 33) of these successful trajectories do not encounter the 2Nlowest configuration. The lowest-energy assembly of the N state has none of the interface contacts that occur for the R state, and any assembly of the N state has at most two such contacts. This suggests that selection for stable assemblies of the normal native state may be a design factor that could help to avoid this misfolding. However, as the model protein sequence already has a very low energy packing of its N conformation (2Nlowest, E = -86), this can be expected to be a minor effect.
|
15% at (ε/T)_{f=0.5} = 0.99. All trajectories that encounter one ground-state assembly also find the other. This interconvertibility is lost once the R conformation is propagated to a third chain to form an R3 assembly (see below).
Conversion efficiency and conversion time for dimeric refolding reaction (2N → R2)
The conversion efficiency (i.e., the proportion of successful refoldings) and the mean first conversion time for the refolding of two chains were monitored for the model protein at two values of (ε/T) (Table 2). At (ε/T)_{f=0.2}, there is ready formation of the minimal R-state dimer. Under these more denaturing conditions, initial R-state dimer formation is ∼15 times slower than the monomeric folding process to the N state. Indeed, the simulations reach equilibrium, with the R state reverting and re-converting 1022 times over the course of the 50 trajectories of 10^10 moves each. At (ε/T)_{f=0.5}, where the ΔGNfold for the normal folding process to the N state is ∼0.0, there is no such reversion to the initial 2N state. At this higher interaction strength, only ∼58% (33 of 57) of the simulations lead to R-state dimer formation (Table 2). From those successful simulations, we can deduce that the mean conversion time is at least ∼7.5 × 10^8, only moderately slower than folding to the N state (∼10-fold).
|
R2 were calculated from the simulations at (ε/T)_{f=0.2}. From a simple ratio of the mean reversion time to the mean conversion time for the 1022 reversions (R2 → 2N) and re-conversions (2N → R2) observed, we can calculate ΔG(2N → R2) = -0.13 at (ε/T)_{f=0.2} = 0.71. This favors the R2 dimer assembly under the simulation conditions.
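The ratio-of-times estimate described above can be sketched as follows. The sketch assumes a two-state equilibrium in which the mean reversion time is the average lifetime of R2 and the mean conversion time is the average lifetime of 2N, and it reports the free energy in units of T; whether the quoted -0.13 is expressed in exactly these units is not stated in the text. The function name and the numerical times are hypothetical.

```python
import math


def delta_g_over_T(mean_reversion_time: float, mean_conversion_time: float) -> float:
    """Free energy of 2N -> R2 (in units of T) from mean residence times.

    mean_reversion_time: average time spent as R2 before reverting to 2N.
    mean_conversion_time: average time spent as 2N before converting to R2.
    A longer R2 lifetime gives a more negative (more favorable) value.
    """
    equilibrium_constant = mean_reversion_time / mean_conversion_time
    return -math.log(equilibrium_constant)


# Hypothetical times (in MC moves) chosen only to illustrate the arithmetic.
t_conv = 1.0e8
t_rev = t_conv * math.exp(0.13)   # R2 lives about 14% longer than 2N
print(round(delta_g_over_T(t_rev, t_conv), 2))
```

A value of -0.13 in these units corresponds to R2 residence times only about 14% longer than 2N residence times, so the preference for R2 at this interaction strength is modest.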
Propagation of the R conformation
Does the alternative low-energy conformation propagate? To address this question for the model protein, we performed a set of simulations for three chains, with two chains assembled as an R2 dimer initially and a third chain in the N conformation in contact with them (Table 2). These are templated simulations. As a comparison, we also performed simulations that started with three chains in the N state (spontaneous simulations; Table 2). An R2 dimer converts a third chain from the N state to the same R conformation with complete fidelity under the more denaturing conditions at (ε/T)_{f=0.2} (50 of 50 simulations). This templated conversion is about five times slower than (2N → R2) dimer formation but is about five times faster than the corresponding spontaneous process starting from 3N (Table 2). The effects of excluded volume or excessive molecular crowding are evident in two ways: First, the rate to dimer formation in the spontaneous three-chain simulations (starting from 3N) is ∼15 times slower with a third chain initially in direct contact. Second, the (3N → R3) process is not the sum of the (2N → R2) and (N + R2 → R3) processes at (ε/T)_{f=0.2} (Table 2).
We found that the existence of a two-faced mode of packing in the ground state as a dimer is sufficient to produce such propagation at a stronger interaction strength where the ΔGNfold of normal N-state folding is marginal (∼0.0). At (ε/T)_{f=0.5}, coopting a third chain to the R conformation has comparable efficiency and rate of refolding to initial R-dimer formation (2N → R2). The corresponding spontaneous process is rare (2 of 42 simulations), although 9 out of 42 of these trajectories reach R2 formation (Table 2). None of the three-chain simulations revert to the original normal native conditions at the stronger interaction strength, (ε/T)_{f=0.5}.
This process of propagation to a third chain is autocatalytic in a simple sense (i.e., the conversion time to a stable R conformation is infinite in the absence of the R2 template). Also, the corresponding spontaneous refolding process for three chains is slower at both ε/T values considered. From our limited simulations here at two ε/T values, it seems that lower ε/T values (corresponding to more denaturing conditions) make the R2 dimer formation process and the propagation process to a third chain faster. This is consistent with general observations for a wide variety of amyloidogenesis mechanisms (e.g., lysozyme-based amyloidogenesis [Booth et al. 1997]).
Sequence mutant studies
We performed simulations to test whether four single-site mutant sequences for the model protein that do not have the R2 dimer in their dimer ground state can propagate the R conformation to a third chain (Table 3). The residue positions at which these sequences differ from the original sequence are indicated in the table. Twenty propagation simulations (process 2; N + R2 → R3) were performed for each sequence at (ε/T) = 0.99. Three of the mutants do not propagate, and the fourth mutant propagates inefficiently (Table 3). The two propagating sequences (the original sequence and mutant 2) both have the R3 assembly in their three-chain ground state. However, what distinguishes the inefficient propagator is the lack of the R2 dimer in its two-chain ground state. These results suggest that the R2 dimer should be in the dimeric ground state to enable efficient propagation of the R conformation to a third chain.
|
(ε/T)_{f=0.5} reveals an example of domain swapping. In domain swapping, a segment of one protein chain is replaced by the corresponding segment of a second protein chain (Bennett et al. 1995). This swapping may be either symmetric (as, e.g., recently described for nitric oxide synthase [Crane et al. 1999]) or serial (as has been suggested for protein aggregation [Bennett et al. 1995]). Domain swapping has also been suggested as a mechanism for the distinction of prion strain conformations (Cohen and Prusiner 1998).
Here, serial domain swapping occurs between multiple chains in an interlocking manner (Fig. 2e). The R-state assembly for the model protein sequence has internal energy E = -87 for two chains and E = -147 for three chains. A serially domain-swapped variant of the R state arises for the same sequence that has E = -86 and E = -146 for two and three chains, respectively (Fig. 2e; denoted R2DS and R3DS, respectively). This configuration has the same interface contacts (eight in number) as the R2 dimer, plus an additional interchain contact that corresponds to an intrachain contact in R2. This gives a total of nine interface contacts, the highest number for a dimeric assembly in the simulations here. (It is theoretically possible for two chains to have up to 16 interfacial contacts between each other. However, these high-energy configurations are not sampled under our effectively high-concentration conditions and are energetically much less stable.) For two and three chains, the R-conformation assembly and its domain-swapped counterpart interconvert readily at (ε/T)_{f=0.2}. For two chains at (ε/T)_{f=0.5}, every simulation that finds the ground-state dimer conformations also finds the R2DS configuration. Intriguingly, R2DS is almost always found via the other (nonpropagating) ground-state dimer assembly (95% of trajectories). However, interconversion between the two forms is not observed at the higher interaction strength ([ε/T]_{f=0.5}) for three chains over the course of any trajectory.
This domain swapping in the simulations arises logically as a theoretical analogy of the prion-strain phenomenon (see Discussion).
Limited simulations for the random sequence
To verify that the basic premise of our investigation is not peculiar to the model protein sequence chosen, we performed a smaller number of simulations for a random sequence (for processes [1–3] listed above). Simulations were performed at (ε/T)_{f=0.10} = 0.99 for the random sequence (this is (ε/T)_{f=0.50} for the model protein sequence). We feel it is more appropriate to compare the sequences for this value of ε/T, as at this ε/T the better-designed model protein is marginally stable (as most real proteins are marginally stable, this would reflect a dominant environment in the cell). With the random sequence, R2 dimer formation is observed (24 of 25 simulations), as well as propagation to a third chain to produce an R3 assembly (15 of 22 simulations). Conversion thus occurs more readily for this random sequence than for the model protein sequence. Less-stable or destabilized sequences for many amyloidogenic proteins are more susceptible to amyloid formation (e.g., Booth et al. 1997) and are also predicted to be intrinsically more likely to encrypt an alternative multimeric ground-state conformation (Harrison et al. 1999). This result suggests that they are also more susceptible kinetically. Simulation of (re)folding to the R state is not feasible to study at an interaction strength at which this random sequence folds stably ([ε/T]_{f=0.50} = 1.79), owing to time constraints.
Two equilibrium scenarios at two extremes of concentration
We examined the behavior of the fractional population of important conformations for the two extremes of concentration in the simulations: infinite dilution and high effective concentration of chains for the model protein sequence. The two sets of population curves indicate two very distinct equilibrium scenarios (Fig. 5). The fractional population curves of the N-state and R-state conformations for protein folding at infinite dilution show that the R-state conformation is rare at equilibrium (Fig. 5a,b) at any value of T/ε (f < 0.0016; maximum f at T/ε = 1.63). For our two-chain folding simulations, the equilibrium probability for the R2 dimer is <0.5, as it shares the ground state with another two-chain assembly (Fig. 5c). The contribution of the domain-swapped variant of R2 to the overall R-state probability is small. The behavior of the population probability of any N dimer is also illustrated in Figure 5c. Interestingly, the 2N state becomes more favored than the R state at low interaction strength, (T/ε) > 1.54, indicating that the 2N state has a higher degree of configurational diversity. However, under such denaturing conditions, both states are disfavored compared with numerous more unfolded configurations.
|
) < 0.99 for the model protein sequence between the conformational space near the 2N state and the conformational space around the R2 state. This is demonstrated by the curve in Figure 5d
values is compared to the extrapolated values for the system at equilibrium. This curve bears some analogy to a hallmark feature discussed by Kauzmann (1948) for the enthalpy difference between a glassy state nN and a crystalline alternative state Rn. In analogy to Kauzmann's scenario of decreasing temperature, the Rn state is more stable but is increasingly more unlikely to be reached the more native the conditions (i.e., higher interaction strength, less denaturing conditions). The two curves in Figure 5
0.5).
A dimeric free-energy surface
We constructed a free-energy landscape or surface from trajectories for two chains at equilibrium (see Materials and Methods). The number of R-state interface contacts (CRinter) between the two chains in a dimer assembly, the most appropriate indicator of the progress of the refolding reaction, was used as one axis of the free-energy surface. The total number of contacts (Ctotal) is used as the orthogonal reaction coordinate to construct a contour plot (Fig. 6a; (ε/T)_{f=0.5} = 0.99). Other surface coordinates are possible, but we have used the total number of contacts as this has previously been used by other workers (e.g., Dinner et al. 2000) and as monomeric folding reactions almost always finally progress toward more compact assemblies (Chan and Dill 1994, Chan and Dill 1998). Surfaces like this one that are based on thermodynamic reaction coordinates should not be overinterpreted kinetically (Chan and Dill 1998; Dinner and Karplus 1999). Nonetheless, they are useful for elucidating thermodynamic relationships among different states and suggest viable kinetic interpretation.
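A surface of this kind is commonly built by histogramming equilibrium samples over the two coordinates and taking the negative logarithm of the populations. The sketch below follows that generic recipe, with free energies in units of T and shifted so the most populated bin is zero; it is an assumption that the article's procedure (described in its Materials and Methods) matches this, and the function name and sample values are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

State = Tuple[int, int]  # (CRinter, Ctotal) for one sampled two-chain configuration


def free_energy_surface(samples: Iterable[State]) -> Dict[State, float]:
    """Free energy (in units of T) of each visited bin, relative to the lowest bin."""
    counts = Counter(samples)
    total = sum(counts.values())
    # F/T = -ln P for each (CRinter, Ctotal) bin that was visited.
    surface = {state: -math.log(n / total) for state, n in counts.items()}
    lowest = min(surface.values())
    return {state: value - lowest for state, value in surface.items()}


# Hypothetical equilibrium samples, not data from the article.
samples = [(0, 30)] * 6 + [(8, 34)] * 3 + [(3, 25)] * 1
surface = free_energy_surface(samples)
for state in sorted(surface):
    print(state, round(surface[state], 3))
```

Bins that are never visited have no defined value in this sketch; a contour plot would typically mask them or assign a ceiling value.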
|
([ε/T] 0.99), the profile is dominated by a small number of specific assemblies along the CRinter coordinate (Fig. 6a,b). At lower ε/T (more denaturing conditions), these specific free-energy wells become less dominant. This is because less contribution to the free energy comes from the enthalpic interaction of two chains (e.g., at [ε/T] = 0.25, the mean interaction enthalpy between two chains is -3.0 [±8.7]).
Figure 6b indicates the trend in free energy and mean energy as a function of CRinter at a high ε/T value (= 1.21). This profile corresponds to one axis of the free-energy surface. It further demonstrates the domination of specific dimer assemblies and also indicates that the conversion process cannot simply be explained by a drive toward greater compactness. At ΔGNfold ∼0.0 for the present system for the model protein sequence, the entropic component (configurational and conformational) of the free energy does not significantly contribute to the total free energy along this coordinate. Under such conditions that excessively favor the native state N at infinite dilution ([ε/T] > 0.99), there is no refolding (from CRinter = 0 to CRinter = 8) over the course of our longest trajectories (10^10 MC moves), indicating that the predominantly enthalpic refolding barriers are too high and too numerous to cross.
| Discussion |
|---|
|
|
|---|
6% of random sequences. They also allow the stacking of chains end to end. An example of such packing is observed in type 1 pilus assembly in E. coli (Choudhury et al. 1999). Amyloid and prions have extensive β structure (Pan et al. 1993; King et al. 1997; Sunde et al. 1997; Taylor et al. 1999). A propagatable helical assembly is conceivable; e.g., a sequence could have an N state with the contact map ([first contacting residue, second contacting residue]: [1,4], [1,16], [3,6], [4,13], [5,8], [5,12], [9,12], [11,14], [13,16]). It could then have an R-state monomer comprising a single, long helix: (1,4), (3,6), (5,8), and so forth, although we did not find any R states that had helical content (they are presumably rarer). The possibility of such a helical propagating state arising in a larger sample of sequences could be a further topic of investigation for this model. The stacked β-sheet-like R state is thus a lattice analog of the amyloid protofilament, as its stacking occurs only along one axis of lattice space. A dimeric R2 template can convert a third chain from the N state to form an R3 assembly in a catalytic fashion. (There is a rate enhancement in the conversion of a third chain to the R conformation by a dimer template, compared to the corresponding spontaneous process.) As the rate of decay for the R3 assembly is far less than the rate of conversion to it, we expect that the R conformation can further propagate to Rn assemblies, forming amyloid-like filaments. The only likely propagatable alternative conformations (i.e., that have two binding faces) that we have found in this model are β-sheet rich. This indicates that β-sheet-rich amyloid- and prion-like alternative conformations are intrinsic to encoded structures and that other forms are unlikely. They can arise for a contact-based energy function without any explicit consideration of separate hydrophobic and hydrogen-bonding terms in the energy function. Also, for some amyloidogenesis/prion formation mechanisms, there might be extensive β structure simply because the lowest-energy propagating assemblies accessible for a small number of chains are among the most extended assemblies possible.
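The hypothetical helical example given in this section can be checked by comparing the quoted N-state contact map with the contacts of a single long helix. The sketch assumes a 16-residue chain (16 is the highest residue index in the quoted map) and helix contacts of the form (i, i + 3) for i = 1, 3, 5, ..., following the "(1,4), (3,6), (5,8), and so forth" pattern; the function name is illustrative.

```python
from typing import Set, Tuple

Contact = Tuple[int, int]

# N-state contact map as quoted in the text (1-based residue numbers).
n_state: Set[Contact] = {(1, 4), (1, 16), (3, 6), (4, 13), (5, 8),
                         (5, 12), (9, 12), (11, 14), (13, 16)}


def helix_contacts(length: int, span: int = 3, step: int = 2) -> Set[Contact]:
    """Contacts (i, i + span) for i = 1, 1 + step, ... within a chain of `length` residues."""
    return {(i, i + span) for i in range(1, length - span + 1, step)}


helix = helix_contacts(16)          # assumed chain length of 16
shared = n_state & helix            # helical contacts already present in N
only_helix = helix - n_state        # contacts that must form on conversion
only_native = n_state - helix       # nonlocal contacts that must break

print(sorted(shared))
print(sorted(only_helix))
print(sorted(only_native))
```

Under these assumptions six of the nine quoted N-state contacts are already helical, one additional contact, (7,10), would have to form, and the three nonlocal contacts would have to break to reach the all-helix monomer.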
Prion-like features and implications
Although our simulations do not reproduce a replicated particle, they nonetheless have a number of key prion-like features. In addition to the β-sheet-like content of the alternative form (that is observed for PrP and the two yeast prions [Pan et al. 1993; King et al. 1997; Taylor et al. 1999]), a minimal alternative conformation of an R2 dimer is observed. This was observed for PrPSc by ionizing radiation studies (Bellinger-Kawahara et al. 1988). In the present simulations, the R monomer is a rare equilibrium unfolding intermediate under dilute conditions. For PrP, although there is some evidence that a mildly protease-resistant β-sheet-containing form is observable at acid pH (Jackson et al. 1999), other evidence suggests that spontaneous conversion to PrPSc is coupled with dimer formation at neutral pH (but not multimer formation of a higher order; Post et al. 1998). Such a coupling of conversion and dimer formation is observed here.
Although a single move to detach a chain from an Rn assembly would be very unlikely in the present simulations, disintegration of an Rn assembly (of greater than three chains) by a combination of moves may arise in simulations of longer duration. As the interaction enthalpy required for conversion of a third chain by a dimer (here 40% of the total individual chain interactions) is too much to enable ready subsequent release of the converted chain, our model predicts that extra factors (such as Protein X in PrP prion propagation; Kaneko et al. 1997) should play a role in facilitating detachment. Assessment of the rates and mechanisms of Rn polymer disintegration is beyond the scope of this article. In a more extensive study, for other model examples not studied here, the interfacial binding energy per additional R unit may be much less than in the present examples and, thus, more likely to give a distribution of Rn polymer sizes (as predicted by Masel et al. 1999). If the binding enthalpy per additional chain were less, the resulting higher rate of detachment for longer simulations under denaturing conditions might produce replication-competent particles. Such a mechanism might be more applicable to the non-PrP prions. (For PrP specifically, PrP prion infectivity is separable from amyloid formation in prion disease [Wille et al. 1996].)
To what extent can an analog of the prion strain phenomenon arise for a protein without the need for accessory factors? Distinct strain conformations of PrPSc are linked to the different prion disease histopathologies and incubation times (Cohen and Prusiner 1998). They have different proteinase-K resistant fragment sizes and antibody binding profiles (Collinge et al. 1996; Telling et al. 1996; Safar et al. 1998). Strain conformations may also occur for the [PSI] yeast prion (Derkatch et al. 1996). There is recent evidence that different modes of metal ion binding are associated with PrP prion strain conformations (Wadsworth et al. 1999). Also, different strains may have different glycosylation patterns (Collinge et al. 1996). Domain swapping has been suggested as a way to obtain different prion strain conformations (Cohen and Prusiner 1998). We have observed a domain-swapped variant (named RnDS above) of the Rn assembly for either two or three chains for the model protein sequence. Its interaction energy is only one dimensionless unit higher than that of the lowest-energy Rn assemblies for two and three chains. Although it is not in the ground state, it remains propagatable because of its relationship with the propagatable Rn assembly. Domain swapping thus arises here as a theoretical analog of the prion strain phenomenon. Our observation suggests that strain conformations may be a general property of polymeric aggregates, including amyloid. The difference in the interconversion behavior for the two-chain and three-chain simulations (under less denaturing conditions) indicates that the likelihood of interconversion between an Rn assembly and its domain-swapped variant will decrease with increasing value of n, with the domain-swapped form as the minor strain variant needing a larger inoculum of chains to propagate its configuration. For the random sequence, there are no propagatable variants of the Rn assembly.
A number of other features in the present model are notable as amyloid-like. Conversion occurs more readily under more denaturing conditions for the monomer, as for many amyloidogenesis mechanisms (e.g., lysozyme-based amyloidosis; Funahashi et al. 1996; Booth et al. 1997). The propagation procedure is essentially irreversible once a large enough assembly (here, three chains) is attained. However, there does not appear to be a well-defined monomeric amyloidogenic intermediate before dimer formation in the present simulations, as described for some amyloidogenesis mechanisms (Kelly 1998).
With the present limited interaction scheme, our model does not have any explicit scoring term for hydrogen bonding, although one can define secondary structure in the conformations of the heteropolymers (see above and Chan and Dill 1991; Li et al. 1996). Our results may indicate that hydrogen bonding per se is not the underlying factor for an amyloid state for heteropolymers. In some observed amyloid fibrils, experimental data indicate that there is substantial stabilization of the fibril from hydrophobic interactions. In transthyretin fibrils, for example, a model is implied by high-angle X-ray diffraction data that has four ß-sheets packed against each other with hydrophobic interactions orthogonal to the hydrogen bonding (Blake and Serpell 1996; Sunde et al. 1997). Such a fibril model comprising four to six ß-sheets also agrees with X-ray data for five other amyloids (Sunde et al. 1997).
The free-energy surface for dimer refolding to a propagatable state
The free-energy surface for dimer formation was characterized under conditions of molecular crowding at high effective concentration. It indicates a split (re)folding landscape leading at either extreme to an alternative multimeric minimum (termed the R state) or assemblies of the N state. Starting from 2N, the chains sample various different assemblies of the N state and fluctuations of them before conversion to the R2 configuration occurs (they are restricted to a small region of the free-energy surface). Refolding from the lowest-energy assembly of the N state (2Nlowest) to the R2 state usually requires transient unfolding and disassembly of the two chains. Refolding to the R2 state is more successful if 2Nlowest is avoided. The 2N state is an apparent kinetically trapped intermediate similar to those predicted for single-chain folding (Bryngelson et al. 1995). This refolding behavior is similar to the folding process observed for smaller HP-model proteins (Chan and Dill 1994).
A key point here is that, at high /T (less denaturing conditions) where the normal native state is at least marginally stable, the dimeric refolding observed is dominated by specific assemblies (as described in detail above and in Fig. 6). There appears to be some entropic contribution to the barrier for conversion to R2 that arises from the fewer extended assemblies of two chains that are accessible from the R2 configuration. This is discernible from the increased slope of the free-energy surface as CRinter increases. However, the same low points are observed along the CRinter reaction coordinate for the mean energy as for the free energy, indicating domination by 2Nlowest, R2, and a small number of other two-chain assemblies.
Sequence design issues for the model protein
Small proteins fold on a time scale of seconds or less (Fedorov and Baldwin 1999). Many troublesome misfolders are small protein chains or peptides (Kelly 1998). Suppose we assume that our folding time to the N state is physiological; that is, the mean first passage time is equivalent to an in vivo folding timescale of 10 sec. Then, at the higher interaction strength (/T)f = 0.5 = 0.99, up to 58% of the population of molecules could refold to the R state in 100 sec, a time scale short enough to be physiologically relevant. This gives a lower bound for the dimeric conversion process. Excessive molecular crowding will tend to lengthen this time scale by an order of magnitude (this is discernible from the spontaneous propagation simulations involving three chains). This nonetheless strongly suggests that for some (but not most) small proteins or peptides that must function under conditions of high concentration (e.g., in GPI-anchored rafts, as in the case of PrP), there must be a strong selection pressure to design out the tendency toward detrimental R-state formation, resulting either in a sequence signal that retards R-state formation or in speedy clearance mechanisms for removing R-state precursors or aggregates.
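The time-scale argument above is a proportional conversion: the folding mean first passage time, measured in attempted MC moves, is equated with a 10 sec in vivo folding time, and other simulated durations are scaled by the same factor. The sketch below shows that arithmetic only. The step counts are hypothetical placeholders chosen to give a tenfold ratio; they are not values reported in the study, and `mc_steps_to_seconds` is an illustrative name.

```python
def mc_steps_to_seconds(steps, folding_mfpt_steps, folding_time_s=10.0):
    """Convert a number of attempted MC moves to seconds.

    Assumes the folding mean first passage time (in MC moves) corresponds
    to `folding_time_s` seconds, as supposed in the text.
    """
    return steps * folding_time_s / folding_mfpt_steps


# Hypothetical step counts, for illustration only (not from the paper).
folding_mfpt_steps = 2.0e7   # assumed mean first passage time to the N state
conversion_steps = 2.0e8     # assumed time for conversion to the R state

conversion_seconds = mc_steps_to_seconds(conversion_steps, folding_mfpt_steps)
print(f"conversion time under these assumptions: {conversion_seconds:.0f} s")
```

With any pair of step counts in a 1:10 ratio, the same mapping gives the 100 sec figure quoted above.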
We have shown that a solution of the normal native nN state for a protein can be considered a glassy state and the Rn state the crystalline form, which becomes increasingly difficult to relax to as conditions become less denaturing. This is evident in our model system from the behavior of the mean energy as a function of /T (Fig. 5d). However, this phenomenon does not suffice to prevent conversion to a propagatable alternative native state at the /T value where the N state would be marginally stable at infinite dilution or under more denaturing conditions.
One sequence design factor that would retard formation of the R2 state would be a mutation that lifts the R2 dimer out of the two-chain ground state. Simulations on single-site sequence mutants of our model protein sequence suggest that the R2 dimer must be in the dimeric ground state to enable efficient propagation of the R conformation. These mutation experiments indicate a thermodynamic component in amyloidogenic/prion-promoting sequence mutations: Where the mutant sequence is rarer for its encoded protein (and there is thus less evolutionary pressure on designing a sufficient barrier to conversion), the mutation may primarily act by pulling the R2 assembly down into the dimeric ground state. Such details of our simulations will help in designing simulations of propagation with a more detailed model, such as one based on polypeptide geometry, either on- or off-lattice. This mutation mechanism may be relevant to PrP as an additional factor in prion disease promotion, as not all PrP prion-disease-causing mutations have been shown to be destabilizing to the PrPC form (Liemann and Glockshuber 1999). Also, the fact that a single-site mutant can drastically change the possible lowest-energy aggregate morphology in our model parallels an experiment on a single-site mutant of the immunoglobulin light chain in immunoglobulin light chain amyloidosis (Helms and Wetzel 1996).
Another possible sequence design factor that may help to retard the formation of propagatable alternative native states Rn might be to select for stable assemblies of the N state during evolution, in addition to selecting for a stable monomeric N state. However, the fact that a relatively low-energy assembly for the normal native N state exists for the model protein in our study (only +1 dimensionless energy unit relative to the R2 assembly) suggests that this would not be sufficient. In addition, assemblies of the N state that are too stable may also lead to solubility problems. Other design issues relate to the nature of the dimeric ground state. Our examples (both random and model-protein sequences) show that one can have competing configurations in the dimeric ground state yet still have a propagating R state maintained. For the model protein sequence, the fact that one of the chains in this alternative ground-state assembly is in the N conformation shows that this does not suffice to prevent propagation of an R state.
We found that the random sequence, which is much more denatured at (/T) = 0.99, converts more readily to the R state. Less stable or destabilized sequences for many amyloidogenic proteins are more susceptible to amyloid formation (e.g., lysozyme-based amyloidosis; Funahashi et al. 1996; Booth et al. 1997) and are also predicted to be intrinsically more likely to encrypt an alternative multimeric ground-state conformation (Harrison et al. 1999). This result suggests that they are also more susceptible kinetically.
| Materials and methods |
|---|
|
|
|---|
(
> 0; Table 1
We set out to generate a model protein to study. Out of the initial random sample of sequences, a further selection of 70 random sequences was chosen that have a unique ground-state conformation. From this set of sequences, we picked one native-state conformation randomly and generated a variety of 20 sequences for it using the sequence design procedure of Shakhnovich and coworkers (Abkevich et al. 1998). This procedure may be viewed as mimicking the pressures of natural selection. For each of these sequences, we performed a further set of enumerations to find low-energy homodimeric assemblies. All possible assemblies that comprise a copy of each possible conformation docked against itself were enumerated. We examined the homodimer enumerations to look for modes of packing that give two binding faces per chain (Fig. 1). This type of assembly is termed an R state. We found one sequence that had the mode of packing in Figure 1a in its dimer ground state. This is the model protein sequence in our study. Its sequence is AHHABPBHHBHHABHH. Its encoded structure has an obvious hydrophobic core (Fig. 2a).
In addition, we performed these homodimeric enumerations for the complete set of 70 random sequences. We found that 4 of the 70 (6%) random sequences have the sort of two-faced packing, as indicated in Figure 1a. Other modes of two-faced packing (such as that in Fig. 1b) are conceivable but were not observed in the enumerations and so are not studied here. We also chose one of these four sequences to study (denoted random sequence). Its sequence is BHPPPPHBAPPPPHHA.
Folding simulation protocol
The MC moves considered in the (re)folding dynamics simulations are crankshafts, end-flips, corner-flips, and pivots (or 'rigid-body rotations') as described previously (Chan and Dill 1994). Because our goal is to deduce general coarse-grained principles rather than to provide detailed predictions, for simplicity, all moves are assigned equal probability, as in previous studies (Chan and Dill 1994, Chan and Dill 1998). We have designed a multichain kinetic model to mimic physiological molecular crowding, which allows for the physical possibility that even in a crowded environment chains can sometimes be detached while undergoing kinetic changes in close proximity to other chains. In our multichain simulations, chains are initialized in randomly chosen configurations such that each chain has at least one contact with one other chain. In addition, in these multichain simulations, a small fraction (5%) of all attempted moves are translational, whereby the entire chain is rigidly displaced one lattice spacing in one of the four possible directions (as in Gupta et al. 1998). Hence chains can diffuse together or drift apart. To model molecular crowding, after every 1 x 10^6 attempted MC moves (i.e., on average every 5 x 10^4 attempted translation moves per chain), every chain is recentered to a region near the origin of the simulation box (where the simulation starts). Chains are recentered en masse with their relative positions maintained if they are in contact, whereas chains that are not in contact are assigned randomly to be in contact with at least one other chain. This recentering algorithm ensures that chains would not drift too far apart and would always have ample opportunities to interact with one another.
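The move-selection and recentering schedule described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a two-dimensional square lattice, draws a translation with probability 0.05, and otherwise picks one of the four internal move types uniformly. All names are hypothetical, and the internal moves themselves, excluded-volume checks, and the recentering step are left out.

```python
import random

INTERNAL_MOVES = ("crankshaft", "end_flip", "corner_flip", "pivot")
TRANSLATION_FRACTION = 0.05        # 5% of attempted moves are translations
RECENTER_INTERVAL = 1_000_000      # recenter after every 10^6 attempted moves
DIRECTIONS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def choose_move(rng):
    """Pick the next attempted move as (move_type, direction_or_None)."""
    if rng.random() < TRANSLATION_FRACTION:
        return ("translate", rng.choice(DIRECTIONS))
    return (rng.choice(INTERNAL_MOVES), None)


def translate(chain, direction):
    """Rigidly displace a whole chain by one lattice spacing.

    No occupancy check is made here; a full simulation would reject
    translations that overlap another chain.
    """
    dx, dy = direction
    return [(x + dx, y + dy) for x, y in chain]


def should_recenter(attempt_number):
    """True when the recentering step is due."""
    return attempt_number > 0 and attempt_number % RECENTER_INTERVAL == 0


# Expected number of translation attempts between recentering steps.
expected_translations = TRANSLATION_FRACTION * RECENTER_INTERVAL
print(f"expected translation attempts per interval: {expected_translations:.0f}")
```

Taking 5% of 10^6 attempted moves gives 5 x 10^4 translation attempts between recentering steps, the figure quoted in the protocol.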
The standard Metropolis MC criterion (Metropolis et al. 1953) is used to accept or reject moves. The Boltzmann constant is set to unity for simplicity. Fifty simulations are studied for each sequence for each process unless otherwise stated in the text. Folding to the N state at infinite dilution is studied for up to 5 x 10^8 attempted moves. These simulations are started from a randomly chosen conformation. Time to fold initially from the unfolded state to the N state is termed the mean first passage time. Refolding from the N state is simulated for 10^10 attempted moves. Time to refold initially from the N state to the R state is called the mean first conversio