International School for Advanced Studies (SISSA/ISAS), I-34014 Trieste, ITALY
Reprint requests to: Cristian Micheletti, International School for Advanced Studies (SISSA/ISAS), Via Beirut 2A, I-34014 Trieste, Italy; e-mail: michelet{at}sissa.it; fax: +39-040-3787528
(RECEIVED August 9, 2001; FINAL REVISION April 4, 2002; ACCEPTED May 8, 2002)
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.3360102.
| Abstract |
|---|
Keywords: Protein-folding modeling; prediction of key folding sites; HIV-1 protease; drug resistance
| Introduction |
|---|
In previous studies (Cecconi et al. 2001; Settanni et al. 2001), we showed how the most delicate folding stages can be identified within a molecular dynamics approach by monitoring the formation probability of native and nonnative contacts from the unfolded to the native state. This can be done either as a function of time at a fixed temperature near the folding temperature, or at thermal equilibrium for a succession of decreasing temperatures (annealing). In principle, the two approaches need not be equivalent but, for the quantities we have investigated, they give consistent results. Hence, to identify crucial contacts, one can safely concentrate on thermodynamic equilibrium at various temperatures. The main limitation of molecular dynamics (MD) and Monte Carlo (MC) simulations, especially for long protein chains, is that they are extremely time-consuming and plagued by statistical errors that can affect predictions based on the relative sensitivity of contact formation. It would therefore be highly desirable to develop a suitable theoretical model, amenable to a deterministic (and computationally fast) treatment, that yields a deeper understanding of the problem. Ideally, such a model should encompass all the "necessary ingredients" usually included in computer simulations: peptide-chain constraints, effective interactions between residues, favorable monomeric positions, and so forth. In the following, we describe a recently developed theoretical scheme (Micheletti et al. 2001a) that, although very simplified and approximate compared with schemes based on MD or MC simulations, can be treated analytically, leading to expressions that can be evaluated exactly. The calculated quantities rival those obtained through more sophisticated but computationally demanding MC and MD techniques.
The purpose of this paper is to show how the model can be used to obtain observables that help identify the folding bottlenecks. In particular, we apply the method to the human immunodeficiency virus type 1 protease (HIV-1 PR), an enzyme that is crucially involved in HIV infection (Condra et al. 1995). In general, accurate knowledge of bottlenecks has important pharmaceutical ramifications because it may be exploited in rational drug design. Because of the large amount of available clinical data, HIV-1 PR is a natural choice for a stringent test of our automated predictive scheme.
| Theory |
|---|
We describe the protein by the coordinates ri of the Cα atom of the i-th amino acid. The simplified energy functional for the chain of N residues is
![]() | ((1)) |
The relative position between amino-acid centroids is denoted by rij = ri − rj, and the corresponding native positions are indicated with the superscript 0. Δ is the contact matrix, whose element Δij is 1 if residues i and j are in contact in the native state (i.e., their Cα separation is below the cutoff c = 6.5 Å) and 0 otherwise. The matrix Δij, along with the set r0ij, encodes the topology of the protein. The factor Θij has the form
![]() | ((2)) |
Θ(X) is the unit step function and R is a distance cutoff defining the range of the interaction between nonconsecutive amino acids. In standard off-lattice approaches, the interaction V(d) between nonbonded amino acids at a distance d is taken to be a square-well potential or some type of Lennard-Jones interaction. Our choice in equation 1
ij is close to 1, while in the denatured state, cases usually are negligible. Although the present form of the model does not accurately describe the effects of self-avoidance, this does not lead to qualitatively wrong behavior in the highly denatured ensemble (large T). The treatment of steric effects becomes progressively more accurate as temperature is lowered. In fact, the model guarantees that the native state is the true ground state, and therefore protein conformations found at low temperature inherit the native self-avoidance. The connectedness of the chain, as well as its entropy, is captured in a simple but nontrivial manner. The most significant advantage of the model is that it can be used to explore the equilibrium thermodynamics without being hampered by inaccurate or sluggish dynamics.
Two limiting cases of the model described by equation 1 are worth noting. In the absence of any bias towards the target structure (i.e., when both Δij and the {r0}'s are removed), the model reduces to the standard Gaussian polymer model, whose behavior is exactly known (Flory 1956; Kloczkowski and Jernigan 1999). Furthermore, in the limit T → 0 (when all native contacts are established and the bonded-energy term fluctuations are negligible), the model reduces to the Gaussian network model that has been introduced and used to study the near-native vibrational properties of several proteins (Bahar et al. 1997, 1999; Keskin et al. 2000; Atilgan et al. 2001).
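The native contact matrix and the Gaussian network limit mentioned above can be illustrated with a minimal sketch. It assumes Cα coordinates given in ångströms, uses the 6.5 Å cutoff quoted in the text, and builds the standard Kirchhoff (connectivity) matrix of the Gaussian network model. The function names, the `min_separation` parameter, and the toy four-residue chain are hypothetical choices made for illustration; they are not taken from the original work.

```python
import numpy as np

def contact_matrix(coords, cutoff=6.5, min_separation=1):
    """Return the 0/1 contact matrix for C-alpha coordinates (angstroms).

    min_separation is an assumption of this sketch: pairs with
    |i - j| < min_separation are excluded (the default only drops i == j).
    """
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    idx = np.arange(len(coords))
    separated = np.abs(idx[:, None] - idx[None, :]) >= min_separation
    return ((dist < cutoff) & separated).astype(int)

def kirchhoff_matrix(delta):
    """Kirchhoff matrix: -1 for each contact, number of contacts on the diagonal."""
    gamma = -np.asarray(delta, dtype=float)
    np.fill_diagonal(gamma, np.sum(delta, axis=1))
    return gamma

def gnm_fluctuations(delta):
    """Relative mean-square fluctuations: diagonal of the pseudo-inverse."""
    return np.diag(np.linalg.pinv(kirchhoff_matrix(delta)))

# Toy example: four residues on a straight line, 3.8 angstroms apart.
toy_coords = [[0.0, 0, 0], [3.8, 0, 0], [7.6, 0, 0], [11.4, 0, 0]]
delta = contact_matrix(toy_coords)
fluct = gnm_fluctuations(delta)
```

In this toy chain only consecutive residues are in contact, and the terminal residues come out more mobile than the inner ones, as expected for a linear elastic network.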
The thermodynamics of the model are fully determined by the partition function
![]() | ((3)) |
In the integral of equation 3 and in the following, translational invariance is always understood to be explicitly broken by fixing, for example, the center of mass of the system (see Appendix).
The integral (3) is still hard to treat analytically because of the presence of nonquadratic interactions in the last term of Hamiltonian (1). We thus perform a further, nontrivial simplification by replacing H with the variational Hamiltonian H0
![]() | ((4)) |
Θij are now substituted by parameters independent of the coordinates. Because of its quadratic form, the model described by equation 4
![]() | ((5)) |
where ⟨. . .⟩0 indicates that the thermal averages are performed with the Hamiltonian H0. In this self-consistent approach, the problem is fully solved, and we can compute the resulting partition function, from which we extract all thermal properties and averages. In particular, the logarithm of the partition function Z has the following explicit expression:
![]() | ((6)) |
![]() | ((7)) |
The quantities pij in equation 5 represent precisely the occurrence probability of a contact between residues i and j; that is, they indicate the frequency with which that native contact is established. At thermal equilibrium, their dependence on temperature reflects the degree of compactness of the protein molecule. For instance, well below the folding temperature, Tf, each pij is expected to assume a value close to unity, as all native contacts are already formed. Instead, for temperatures much larger than Tf, all pij(T) tend to be very small, reflecting the low propensity of the protein to establish contacts. Thermodynamic quantities can be derived easily from the pij's. Another quantity necessary to characterize the folding transition is the specific heat, which exhibits one or more peaks at significant structural rearrangements of the protein conformation. Because every energy change is associated mainly with the formation of native interactions, we address the question of which native contacts contribute most to the peak(s) of the specific heat. A clear answer is found readily in the temperature behavior of the frequencies pij. Indeed, each pij(T) exhibits a sigmoidal dependence on temperature, and the modulus of its temperature derivative develops a sharp maximum at the point of inflection (crossover temperature). The importance of every native contact ij turns out to be characterized by the crossover temperature and by the maximum slope of its pij, which can be regarded as an indicator of its degree of cooperativity. In fact, the most important contacts are those with a high crossover temperature and an associated high cooperativity. This allows a complete identification and classification of the bottlenecks, because we are now able to identify those contacts that are thermodynamically relevant to the peaks and shoulders of the specific heat.
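The crossover temperature and maximum slope described above can be extracted numerically from a tabulated pij(T) curve. The following is a minimal sketch, assuming the contact probability has already been computed on a temperature grid; the function name `crossover` and the synthetic sigmoid used as input are hypothetical and serve only as an illustration, not as data from the model.

```python
import numpy as np

def crossover(temperatures, p):
    """Return (crossover temperature, maximum |dp/dT|) for one contact.

    The crossover temperature is taken as the grid point where the modulus
    of the numerical temperature derivative of p is largest.
    """
    temperatures = np.asarray(temperatures, dtype=float)
    p = np.asarray(p, dtype=float)
    slope = np.abs(np.gradient(p, temperatures))
    k = int(np.argmax(slope))
    return temperatures[k], slope[k]

# Synthetic sigmoidal curve (illustrative only, arbitrary temperature units):
# close to 1 at low T, close to 0 at high T, inflection at T = 1.
T = np.linspace(0.2, 2.0, 181)
p_demo = 1.0 / (1.0 + np.exp((T - 1.0) / 0.05))
T_c, max_slope = crossover(T, p_demo)
```

Applying the same function to every native contact gives one (crossover temperature, slope) pair per contact, which can then be sorted to single out the contacts with high crossover temperature and high cooperativity.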
| Application to HIV-1 protease |
|---|
Indeed, mutants resistant to protease inhibitors can emerge in vivo after <1 year (Condra et al. 1995). Table 1 summarizes the known mutating sites of HIV-1 PR that cause drug resistance.
In particular, we will be concerned with the characterization of such an ensemble near the folding transition temperature. The motivation stems from a recent study (Cecconi et al. 2001) in which we showed that such mutating amino acids correspond, with high statistical significance, to sites involved in the folding kinetic bottlenecks. The rationale for this finding is that the most effective drugs can be eluded only by mutations occurring at the key sites. Because the folded native conformation is sensitive to these sites, only fine-tuned mutations are allowed there. Such mutations have to preserve native-like enzymatic activity while avoiding the drug action. These constraints act as a severe selective pressure on the mutated proteases that HIV is able to express. As a result, the mutations that ultimately cause drug resistance are expected to occur at the crucial sites. These residues are influenced heavily by the native topology and hence should display little dependence on the particular (effective) drug to be eluded.
It is therefore our purpose to apply the scheme introduced in the previous section and identify the key residues within our topology-based scheme. The method, being completely analytic, is free from the statistical uncertainty common to all MC and MD simulation methods, and from the difficulty (as a result of spatial restraints) of reaching the target native state below the folding temperature.
| Results and Discussion |
|---|
![]() | ((8)) |
![]() | ((9)) |
![]() | ((10)) |
Finally, we remark that the determination of the key contacts does not uniquely provide the key folding sites, as two sites are involved in each pairwise contact. This ambiguity can, in several cases, be resolved either by selecting those sites that take part in several crucial contacts, or by examining their distribution on the three-dimensional native structure for clues that may help break the ambiguity.
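The first strategy mentioned above, selecting the sites that take part in several crucial contacts, amounts to a simple count. The sketch below assumes the key contacts are available as pairs of residue indices; the function name `rank_sites` and the residue numbers in the example are hypothetical placeholders and do not correspond to the contacts found for HIV-1 PR.

```python
from collections import Counter

def rank_sites(key_contacts):
    """Rank residues by the number of key contacts they take part in.

    key_contacts: iterable of (i, j) residue-index pairs.
    Returns (site, count) tuples, most frequent first; ties are broken
    by residue index.
    """
    counts = Counter()
    for i, j in key_contacts:
        counts[i] += 1
        counts[j] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical contact list, for illustration only.
demo_contacts = [(10, 25), (10, 31), (25, 40), (10, 52)]
ranking = rank_sites(demo_contacts)
```

Sites at the top of the ranking are the natural candidates when a contact alone does not say which of its two residues is the key one; the remaining ties still require inspection of the native structure.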
| Conclusions |
|---|
The proposed approach to identifying the crucial residues is quite general and ought to be useful in identifying the kinetic bottlenecks of other viral enzymes of pharmaceutical interest, thus aiding in the development of novel effective inhibitors. We expect to focus our future efforts on improving the present approach by taking into account the propensities of different amino acids to form contacting pairs. This limitation can be overcome by introducing physically viable (attractive) pairwise interactions (Maiorov and Crippen 1992; Sippl 1995; Seno et al. 1998; Miyazawa and Jernigan 1999; Micheletti et al. 2001b). In the present approach, this possibility was deliberately avoided to highlight the influence of the native-state topology alone on the kinetic bottlenecks, irrespective of the different chemical nature and strength of the effective amino-acid interactions. We expect that the inclusion of such effects, while not distorting the overall picture presented here, may change the relative strength of spatially close contacts. This may improve the agreement between Table 1 and Tables 2 and 3 by resolving those cases where a site adjacent to a mutating one is selected.
| Appendix |
|---|
![]() | ((11)) |
Σj Aij = 0, which amounts to saying that the uniform vector v1 = N^(-1/2) (1, 1, 1, . . ., 1) is an eigenvector of A with eigenvalue λ1 = 0. We assume that H0 is invariant only under the simultaneous translation of all the coordinates, {xi}. In this case, all other eigenvalues, {λi>1}, are strictly positive, and the corresponding eigenvectors vi>1 are all orthogonal to the zero mode v1.
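The zero-mode structure described above is easy to check numerically. The sketch below assumes only that A is a real symmetric matrix whose rows sum to zero; the small matrix used as input is a hypothetical four-site chain chosen for illustration, not the matrix of the model, and the function name `split_zero_mode` is likewise an assumption.

```python
import numpy as np

def split_zero_mode(A, tol=1e-10):
    """Separate the translational zero mode of A from the remaining modes.

    Returns (positive eigenvalues, corresponding eigenvectors as columns).
    Raises ValueError if A does not have exactly one zero eigenvalue.
    """
    A = np.asarray(A, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    if abs(eigvals[0]) > tol or eigvals[1] <= tol:
        raise ValueError("expected exactly one zero eigenvalue")
    return eigvals[1:], eigvecs[:, 1:]

# Hypothetical example: nearest-neighbour couplings on a four-site chain.
A_demo = np.array([[ 1., -1.,  0.,  0.],
                   [-1.,  2., -1.,  0.],
                   [ 0., -1.,  2., -1.],
                   [ 0.,  0., -1.,  1.]])
n = A_demo.shape[0]
v1 = np.ones(n) / np.sqrt(n)           # uniform vector N^(-1/2) (1, ..., 1)
lam, vecs = split_zero_mode(A_demo)
zero_mode_residual = np.linalg.norm(A_demo @ v1)
overlap_with_v1 = np.abs(vecs.T @ v1)  # should vanish for every mode i > 1
```

Here `zero_mode_residual` vanishes because the rows of the matrix sum to zero, and all entries of `overlap_with_v1` vanish because the remaining eigenvectors are orthogonal to the uniform vector.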
By rewriting the Dirac-δ constraint as
![]() |
where
![]() | ((12)) |
![]() |
![]() |
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
Alm, E. and Baker, D. 1999. Prediction of protein folding mechanisms from free energy landscapes derived from native structures. Proc. Natl. Acad. Sci. 96: 11305–11310.
Atilgan, A.R., Durell, S.R., Jernigan, R.L., Demirel, M.C., Keskin, O., and Bahar, I. 2001. Anisotropy of fluctuation dynamics of proteins with an elastic network model. Biophys. J. 80: 505–515.
Bahar, I., Atilgan, A.R., and Erman, B. 1997. Direct evaluation of thermal fluctuations in proteins using a single parameter harmonic potential. Folding and Design 2: 173–181.
Bahar, I., Erman, B., Jernigan, R.L., Atilgan, A.R., and Covell, D.G. 1999. Collective motions in HIV-1 reverse transcriptase: Examination of flexibility and enzyme function. J. Mol. Biol. 285: 1023–1037.
Baker, D. 2000. A surprising simplicity to protein folding. Nature 405: 39–42.
Camacho, C.J. and Thirumalai, D. 1993. Kinetics and thermodynamics of folding in model proteins. Proc. Natl. Acad. Sci. 90: 6369–6372.
Camacho, C.J. and Thirumalai, D. 1995. Theoretical predictions of folding pathways by using the proximity rule, with applications to bovine pancreatic trypsin inhibitor. Proc. Natl. Acad. Sci. 92: 1277–1281.
Cecconi, F., Micheletti, C., Carloni, P., and Maritan, A. 2001. Molecular dynamics studies of HIV-1 protease: Drug resistance and folding pathways. Proteins: Structure Function and Genetics 43: 365–372.
Chan, H.S. and Dill, K.A. 1990. The effects of internal constraints on the configurations of chain molecules. J. Chem. Phys. 92: 3118–3135.
Chiti, F., Taddei, N., White, P.M., Bucciantini, M., Magherini, F., Stefani, M., and Dobson, C.M. 1999. Mutational analysis of acylphosphatase suggests the importance of topology and contact order in protein folding. Nat. Struct. Biol. 6: 1005–1009.
Clementi, C., Nymeyer, H., and Onuchic, J.N. 2000. Topological and energetic factors: What determines the structural details of the transition state ensemble and 'en-route' intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298: 937–953.
Condra, J.H., Schleif, W.A., Blahy, O.M., Gabryelski, L.J., Graham, D.J., Quintero, J.C., Rhodes, A., Robbins, H.L., Roth, E., Shivaprakash, M., et al. 1995. In-vivo emergence of HIV-1 variants resistant to multiple protease inhibitors. Nature 374: 569–571.
Debe, D.A. and Goddard III, W.A. 1999. First principles prediction of protein folding rates. J. Mol. Biol. 294: 619–625.
Fersht, A.R. 1995. Optimization of rates of protein folding: The nucleation-condensation mechanism and its implications. Proc. Natl. Acad. Sci. 92: 10869–10873.
Flory, P.J. 1956. Theory of elastic mechanisms in fibrous proteins. J. Am. Chem. Soc. 78: 5222–5235.
Galzitskaya, O.V. and Finkelstein, A.V. 1999. A theoretical search for folding/unfolding nuclei in 3D protein structure. Proc. Natl. Acad. Sci. 96: 11299–11304.
Go, N. and Scheraga, H.A. 1976. On the use of classical statistical mechanics in the treatment of polymer chain conformations. Macromolecules 9: 535–542.
Hoang, T.X. and Cieplak, M. 2000. Sequencing of folding events in Go-type proteins. J. Chem. Phys. 113: 8319–8328.
Jackson, S.E. 1998. How do small single-domain proteins fold? Folding and Design 3: R81–R91.
Jacobsen, H., Hanggi, M., Ott, M., Duncan, I.B., Owen, S., Andreoni, M., Vella, S., and Mous, J. 1996. In vivo resistance to a human immunodeficiency virus type 1 proteinase inhibitor: Mutations, kinetics, and frequencies. J. Infect. Dis. 173: 1379–1387.
Kaya, H. and Chan, H.S. 2000. Energetic components of cooperative protein folding. Phys. Rev. Lett. 85: 4823–4826.
Kaya, H. and Chan, H.S. 2001. Polymer principles of protein calorimetric two-state cooperativity. Proteins: Structure Function and Genetics 43: 523.
Keskin, O., Bahar, I., and Jernigan, R.L. 2000. Proteins with similar architectures exhibit similar large-scale dynamic behavior. Biophys. J. 78: 2093–2106.
Kloczkowski, A. and Jernigan, R.L. 1999. Contacts between segments in the random-flight model of polymer chains. Comp. Theor. Pol. Sci. 9: 285–294.
Lazaridis, T. and Karplus, M. 1997. "New view" of protein folding reconciled with the old through multiple unfolding simulations. Science 278: 1928–1931.
Maiorov, V.N. and Crippen, G.M. 1992. Contact potential that recognizes the correct folding of globular proteins. J. Mol. Biol. 227: 876–888.
Maritan, A., Micheletti, C., and Banavar, J.R. 2000. Role of secondary motifs in fast folding polymers: A dynamical variational principle. Phys. Rev. Lett. 84: 3009–3012.
Markowitz, M., Mo, H., Kempf, D.J., Norbeck, D.W., Bhat, T.N., Erickson, J.W., and Ho, D.D. 1995. Selection and analysis of human immunodeficiency virus type 1 variants with increased resistance to ABT-538, a novel protease inhibitor. J. Virol. 69: 701–706.
Martinez, J.C. and Serrano, L. 1999. The folding transition state between SH3 domains is conformationally restricted and evolutionarily conserved. Nat. Struct. Biol. 6: 1010–1016.
Micheletti, C., Banavar, J.R., Maritan, A., and Seno, F. 1999. Protein structures and optimal folding from a geometrical variational principle. Phys. Rev. Lett. 82: 3372–3375.
Micheletti, C., Banavar, J.R., and Maritan, A. 2001a. Protein conformations in equilibrium. Phys. Rev. Lett. 87: 088102-1.
Micheletti, C., Seno, F., Banavar, J.R., and Maritan, A. 2001b. Learning effective amino acid interactions through iterative stochastic techniques. Proteins: Structure Function and Genetics 42: 422–431.
Miyazawa, S. and Jernigan, R.L. 1999. Residue-residue potentials with a favorable contact pair term and an unfavorable high packing density term, for simulation and threading. J. Mol. Biol. 256: 623–644.
Molla, A., Korneyeva, M., Gao, Q., Vasavanonda, S., Schipper, P.J., Mo, H.M., Markowitz, M., Chernyavskiy, T., Niu, P., Lyons, N., Hsu, A., Granneman, G.R., Ho, D.D., Boucher, C.A., Leonard, J.M., Norbeck, D.W., and Kempf, D.J. 1996. Ordered accumulation of mutations in HIV protease confers resistance to ritonavir. Nat. Med. 2: 760–766.
Patick, A.K., Mo, H., Markowitz, M., Appelt, K., Wu, B., Musick, L., Kalish, V., Kaldor, S., Reich, S., Ho, D., and Webber, S. 1996. Antiviral and resistance studies of AG1343, an orally bioavailable inhibitor of human immunodeficiency virus protease. Antimicrob. Agents Chemother. 40: 292–297.
Plaxco, K.W., Simons, K.T., and Baker, D. 1998. Contact order and transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277: 985–994.
Reddy, P. and Ross, J. 1999. Amprenavir: A protease inhibitor for the treatment of patients with HIV-1 infection. Formulary 34: 567–675.
Riddle, D.S., Grantcharova, V.P., Santiago, J.V., Alm, E., Ruczinski, I., and Baker, D. 1998. Experiment and theory highlight role of native state topology in SH3 folding. Nat. Struct. Biol. 6: 1016–1024.
Sali, A., Shakhnovich, E., and Karplus, M. 1994. How does a protein fold? Nature 369: 248–251.
Seno, F., Micheletti, C., Maritan, A., and Banavar, J.R. 1998. Variational approach to protein design and extraction of interaction potentials. Phys. Rev. Lett. 81: 2172.
Settanni, G., Cattaneo, C., and Maritan, A. 2001. Role of native state topology in the stabilization of intracellular antibodies. Biophys. J. 80: 2935–2945.
Sippl, M.J. 1995. Knowledge based potentials for proteins. Curr. Opin. Struct. Biol. 5: 229–235.
Tisdale, M., Myers, R.E., Maschera, B., Parry, N.R., Oliver, N.M., and Blair, E.D. 1995. Cross-resistance analysis of human immunodeficiency virus type 1 variants individually selected for resistance to 5 different protease inhibitors. Antimicrob. Agents Chemother. 39: 1704–1710.
Wolynes, P.G., Onuchic, J.N., and Thirumalai, D. 1995. Navigating the folding routes. Science 267: 1619–1620.