1 Howard Hughes Medical Institute Center for Single Molecule Biophysics, Department of Physiology and Biophysics, State University of New York (SUNY) at Buffalo, Buffalo, New York 14214, USA
2 Surface Physics Laboratory (National Key Laboratory) and T-Center for Life Sciences, Fudan University, Shanghai 200433, China
3 School of Physics and Electronic Information, Wenzhou Normal College, Wenzhou 325027, China
4 T.D. Lee Physics Laboratory and Research Center for Theoretical Physics and 5 Department of Macromolecular Science, The Key Laboratory of Molecular Engineering of Polymers, Fudan University, Shanghai 200433, China
Reprint requests to: Yaoqi Zhou, Howard Hughes Medical Institute Center for Single Molecule Biophysics and Department of Physiology and Biophysics, State University of New York at Buffalo, 124 Sherman Hall, Buffalo, NY 14214, USA; e-mail: yqzhou{at}buffalo.edu; fax: (716) 829-2344.
(RECEIVED December 27, 2004; FINAL REVISION April 13, 2005; ACCEPTED April 22, 2005)
| Abstract |
|---|
α-spectrin SH3 domain protein (PDB ID 1SHG). We show that a model with all-atomic detail predicts the flexibility of residues in proteins significantly more accurately than does a coarse-grained residue-level model. The accuracy of the flexibility prediction is further confirmed by applying the method to 18 additional proteins, the largest of which has 224 residues.
Keywords: protein flexibility; mean-field statistical theory; protein thermodynamics; all-atom model
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.041311005.
| Introduction |
|---|
Several methods have been developed for efficient flexibility prediction. Efficiency is achieved by simplifying the protein model, the atomic interactions, or both. Examples are Gaussian and anisotropic network models (GNM and ANM) (Bahar et al. 1997; Doruker et al. 2000; Atilgan et al. 2001; Micheletti et al. 2004), a graph-theoretical method (Jacobs et al. 2001), and a statistical mean-field theory (Micheletti et al. 2001; Canino et al. 2002). GNM and ANM predict flexibility from a normal mode analysis of a simplified representation of proteins, whereas the graph-theoretical method provides a coarse-grained estimate of flexibility based on connectivity.
This work is based on a recently developed self-consistent, mean-field-like model for studying proteins in thermodynamic equilibrium (Micheletti et al. 2001, 2002). The model approximates a protein as a chain of beads located at the Cα atoms of the constituent amino acid residues. There are two types of interactions: harmonic interactions between successive beads and Go-like interactions between nonbonded beads (Taketomi et al. 1975; Ueda et al. 1978). The model has been further generalized to isolated proteins in the presence of an external force field by Shen et al. (2002) and to protein–protein binding by Canino et al. (2002). The major advantage of the mean-field-like formulation and its subsequent generalizations is that it allows the analytical evaluation of the partition function. Here, we extend this mean-field-like formulation of a Cα-based model protein to an all-heavy-atom model. We find that this extension allows a more accurate prediction of protein flexibility.
| Materials and methods |
|---|
Cα-based model (Micheletti et al. 2001) as follows:
![]() | (1) |
where the first term is a harmonic bond potential summed over covalently bonded atomic pairs i and j; K is the spring constant; T is the temperature ($k_B = 1$); $r_{i,j}$ and $r^0_{i,j}$ are the distance between atoms i and j and the native bond length, respectively; the step function $\theta(x)$ is 1 if $x > 0$, and 0 otherwise; the element of the contact matrix, $\Delta_{i,j}$, is 1 if i and j are in contact, and 0 otherwise; $x_{i,j}$ is the contact energy between atoms i and j; and $X_{i,j} = (r_{i,j} - r^0_{i,j})^2 - R^2$, with R (the nonbonded interaction range) = 3 Å. Here, the nonbonded interaction is a harmonic well suitable for a self-consistent solution (Micheletti et al. 2001, 2002). As in the original model, a Go model ($x_{i,j} = 1$) is used (also see Results and Discussion). A contact is defined if the distance between two atoms in different residues is < 6.5 Å. We also studied the dependence of protein flexibility on the cutoff distance (see Results and Discussion).
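The contact definition above can be illustrated with a minimal sketch. It assumes atomic coordinates in Å and one residue label per atom; the function name `contact_matrix` and the toy coordinates are hypothetical and not taken from the original work. Covalently bonded pairs that span two residues (for example, across a peptide bond) are not treated specially here.

```python
import math

def contact_matrix(coords, residue_ids, cutoff=6.5):
    """Return a symmetric 0/1 matrix: 1 if two atoms from different
    residues are closer than `cutoff` (in Angstrom), 0 otherwise."""
    n = len(coords)
    delta = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if residue_ids[i] == residue_ids[j]:
                continue  # contacts are defined only between different residues
            if math.dist(coords[i], coords[j]) < cutoff:
                delta[i][j] = delta[j][i] = 1
    return delta

# Toy example: four atoms on a line, belonging to three residues.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
residue_ids = [1, 1, 2, 3]
delta_short = contact_matrix(coords, residue_ids, cutoff=6.5)
delta_long = contact_matrix(coords, residue_ids, cutoff=10.5)
```

Rerunning the same function with a larger `cutoff` is one simple way to explore the cutoff dependence mentioned in the text.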
The Hamiltonian shown in Equation 1 cannot be integrated analytically to calculate the partition function
![]() |
Micheletti et al. (2001) showed that the model becomes mathematically tractable if the Heaviside function $\theta(X_{i,j})$ is replaced by its preaveraged value $p_{i,j}$ ($= \langle \theta(X_{i,j}) \rangle_H$). That is,
![]() | (2) |
Physically, $p_{i,j}$ is the equilibrium contact probability of atoms i and j at temperature T. For a covalently bonded atomic pair, $p_{i,j}$ is set to 1.
The partition function, now, can be written as
![]() | (3) |
where
![]() |
and $M^{-1}$ is an $N \times N$ matrix (N is the total number of atoms) given by
![]() | (4) |
where I represents the residue index, i and j are the atomic indexes, $n^{b}_{i}$ is the number of atomic bonds for atom i, and $\Delta^{\mathrm{bond}}_{i,j} = 1$ for a bonded atomic pair, and 0 otherwise. The contact probability can be expressed as an incomplete γ function:
![]() | (5) |
where $G_{i,j} = M_{i,i} + M_{j,j} - 2M_{i,j}$. Here $p_{i,i}$ is set to 0. This equation can be solved iteratively for $p_{i,j}$. In the calculation, we set the spring constant K = 1/15. Other values can also be used, because the results are not sensitive to the value of K, as found by Canino et al. (2002). The initial value of $p_{i,j}$ is set to $\Delta_{i,j}$. To achieve stable convergence of the algorithm, the translational invariance of the Hamiltonian has to be broken. This was achieved by modifying the diagonal elements of the matrix $M^{-1}_{i,j}$ as in Canino et al. (2002). Convergence of $p_{i,j}$ to within 0.001 occurred within a few steps. Once $p_{i,j}$ is obtained, the partition function and thermodynamic properties such as the energy, entropy, and heat capacity of the system can be obtained. The specific heat capacity $C_V = T\,dS/dT|_V = dE/dT|_V$ can be calculated from the average internal energy
![]() | (6) |
where $N_b$ is the total number of bonds in a given chain. In addition to thermodynamic properties, one can also evaluate the average fraction of native contacts and the root mean squared fluctuation (RMSF) of the atoms around their original positions. The RMSF can be obtained from the second moment of the multidimensional Gaussian partition function
![]() | (7) |
and
![]() | (8) |
where $\mathrm{MSF}_i$ is the mean squared fluctuation of atom i, $\mathrm{RMSF}_I$ is the root mean squared fluctuation of residue I, and $n_I$ is the number of atoms in that residue.
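The iterative solution for the contact probabilities described above can be sketched as a generic fixed-point loop. This is only an illustration under stated assumptions: the `update` callable stands in for the step that rebuilds the matrix from the current probabilities and re-evaluates them, and `toy_update` is a made-up contraction used purely to exercise the loop, not the model's actual equations. The helper `pair_fluctuation` implements only the relation G(i,j) = M(i,i) + M(j,j) − 2M(i,j) stated in the text. All function names are hypothetical.

```python
def pair_fluctuation(M):
    """G[i][j] = M[i][i] + M[j][j] - 2*M[i][j], as defined in the text."""
    n = len(M)
    return [[M[i][i] + M[j][j] - 2.0 * M[i][j] for j in range(n)]
            for i in range(n)]

def solve_self_consistent(p0, update, tol=1e-3, max_iter=100):
    """Iterate p <- update(p) until no element changes by more than tol.
    Returns the converged matrix and the number of iterations used."""
    p = [row[:] for row in p0]
    for step in range(1, max_iter + 1):
        p_new = update(p)
        change = max(abs(a - b)
                     for row_new, row_old in zip(p_new, p)
                     for a, b in zip(row_new, row_old))
        p = p_new
        if change < tol:
            return p, step
    raise RuntimeError("self-consistent iteration did not converge")

def toy_update(p):
    """Hypothetical stand-in for the real update: off-diagonal elements
    relax toward 0.5, diagonal elements stay at 0."""
    n = len(p)
    return [[0.0 if i == j else 0.5 * (p[i][j] + 0.5) for j in range(n)]
            for i in range(n)]

# Start from a 0/1 contact matrix, as the text does for the initial guess.
p_start = [[0.0, 1.0], [1.0, 0.0]]
p_final, n_steps = solve_self_consistent(p_start, toy_update, tol=1e-3)
G_example = pair_fluctuation([[2.0, 1.0], [1.0, 3.0]])
```

Passing the update step as a callable keeps the convergence test (largest elementwise change below the tolerance) separate from the model-specific algebra.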
| Results and Discussion |
|---|
α protein and one all-β protein. They are a 46-residue three-helix bundle protein, fragment B of Staphylococcal protein A (Protein Data Bank [PDB] ID 1BDD), and a 56-residue α-spectrin SH3 domain protein (PDB ID 1SHG), respectively.
Figure 1 compares the specific heat capacity $C_V$ given by the coarse-grained residue-level (Cα only) model with that given by the all-atom model for fragment B of protein A. The peak height and area for the folding transition in the all-atom model are significantly larger than those in the residue-based model. The folding transition temperature for the all-atom model is also much higher than that for the residue-based model. A similar feature is observed for the α-spectrin SH3 domain protein in Figure 2. This result is in part due to the significantly larger number of interactions in the atomic model than in the residue-level model. It is also consistent with the finding that an all-atom model with specific packing yields a stronger transition than does a residue-based model (Zhou and Linhananta 2002). However, it is difficult to assess which model yields a more accurate $C_V$ curve because the peaks in both curves are very broad. In contrast, a typical heat-capacity curve of proteins is much narrower, as a result of a first-order-like, cooperative folding transition (Privalov 1979; Zhou et al. 1999; Kaya and Chan 2003).
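Because the heat capacity is the temperature derivative of the average internal energy (C_V = dE/dT at constant volume), a curve like those compared above can be estimated numerically from energies tabulated on a temperature grid. The following is a minimal sketch using finite differences; the function names and the toy energy values are hypothetical and are not data from this study.

```python
def heat_capacity(temps, energies):
    """Estimate C_V = dE/dT with central differences at interior points
    and one-sided differences at the two ends of the grid."""
    n = len(temps)
    cv = []
    for k in range(n):
        lo = max(k - 1, 0)
        hi = min(k + 1, n - 1)
        cv.append((energies[hi] - energies[lo]) / (temps[hi] - temps[lo]))
    return cv

def peak_temperature(temps, cv):
    """Temperature at which the heat-capacity curve is largest."""
    k_max = max(range(len(cv)), key=lambda k: cv[k])
    return temps[k_max]

# Toy sigmoidal energy curve with a transition near T = 3.
temps = [1.0, 2.0, 3.0, 4.0, 5.0]
energies = [0.0, 0.1, 1.0, 1.9, 2.0]
cv_curve = heat_capacity(temps, energies)
t_peak = peak_temperature(temps, cv_curve)
```

The peak position gives a simple estimate of the transition temperature, and the peak width gives a rough sense of how cooperative the transition is.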
α-spectrin SH3 domain protein at T=5 is shown in Figure 4
$x_{i,j}$ from the Miyazawa-Jernigan (MJ) parameter set (Miyazawa and Jernigan 1985) and from Canino et al. (2002) (the latter was taken from Dasgupta et al. 1997). This is done by applying the residue-based parameter to all atoms in that residue. For 1SHG at T=5, the correlation coefficient between the all-atom MFM results and the experimental results is 0.79 with the MJ parameter set and 0.78 with the Dasgupta parameter set. Thus, there is no obvious improvement from the use of residue-based energy parameters in the all-atom MFM. There is also no improvement at the residue level: the correlation coefficient for the residue-level model is 0.60 for the Go model and 0.58 for both the MJ and Dasgupta parameter sets. We also used the statistical atomic contact energy obtained by McConkey et al. (2003) for $x_{i,j}$. However, we find that the correlation between predicted and experimental RMSF values becomes significantly worse. For example, the correlation coefficient is reduced from 0.79 to 0.60 for 1SHG at T=5. Clearly, a different parameter set is needed to further improve the accuracy of the flexibility predicted by the all-atom MFM developed here.
The results reported above are for only two small proteins. We further tested the all-atom mean-field theory on an additional six all-α, three all-β, and three mixed α, β proteins. Most were selected for their relatively small size, along with a few of medium size. The PDB identifiers, the sizes of the proteins, and the experimental methods are listed in Table 2. In addition, this table shows the correlation coefficients between theoretically predicted (both residue-level and all-atom-level models) and experimentally measured RMSF values (based on temperature B-factors or fluctuation data from NMR experiments deposited in the PDB), along with their dependence on the cutoff distance that defines the native contact.
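A comparison of this kind can be sketched in a few lines. The example below makes several assumptions that are not confirmed by the text: that the per-residue RMSF is the square root of the mean atomic MSF over the atoms in the residue, that crystallographic B-factors are converted with the standard isotropic relation B = (8π²/3)⟨Δr²⟩, and that the correlation coefficient is the Pearson coefficient. All function names are hypothetical.

```python
import math

def residue_rmsf(atom_msf, atom_residue):
    """Per-residue RMSF, assumed here to be sqrt(mean atomic MSF)
    over the atoms that belong to each residue."""
    sums, counts = {}, {}
    for msf, res in zip(atom_msf, atom_residue):
        sums[res] = sums.get(res, 0.0) + msf
        counts[res] = counts.get(res, 0) + 1
    return {res: math.sqrt(sums[res] / counts[res]) for res in sums}

def bfactor_to_rmsf(b_factor):
    """Invert the isotropic relation B = (8*pi^2/3) * <dr^2>."""
    return math.sqrt(3.0 * b_factor / (8.0 * math.pi ** 2))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Toy data: three atoms in two residues, and made-up values for comparison.
rmsf_by_residue = residue_rmsf([1.0, 3.0, 4.0], [1, 1, 2])
r_perfect = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Because the Pearson coefficient is insensitive to overall scale, the unknown energy and length scales of a simplified model do not affect the comparison, although the square-root conversion of B-factors does change the shape of the experimental profile.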
A more detailed examination of Table 2 indicates that, in general, a large contact cutoff is required for a more accurate prediction of flexibility by the all-atom MFM. In fact, if a cutoff distance of 6.5 Å is used, the all-atom model provides a more accurate prediction than does the residue-based model in only five out of 14 proteins (based on the correlation coefficients). Only at a larger cutoff value (e.g., 10.5 Å or 14.5 Å) does the all-atom model become more accurate in most cases (11 out of 14 cases at 10.5 Å and 14.5 Å). It is not entirely clear why a longer cutoff distance is required for a more accurate prediction of flexibility in the all-atom model. A similar situation was observed in the anisotropic network model (Atilgan et al. 2001), where it was found that a large cutoff value (12–15 Å) is required in order to remove certain unphysical behavior of the model. One possibility is that one may have to go beyond the first coordination shell around a residue (about 6.5 Å) (Bahar et al. 1997) for a better estimate of the interactions in proteins, as a result of long-range electrostatic interactions. On the other hand, the large cutoff value may be the result of compensation for the crude approximation of the atomic interactions in the all-atom MFM.
While the flexibilities of the majority of proteins studied here are predicted with reasonable accuracy, there are no significant correlations at any cutoff distance for two proteins (1CF7 and 1DIV) with either the residue-based or the all-atom MFM. These two proteins happen to have the lowest resolution (2.6 Å) among the nine proteins whose structures were solved by X-ray crystallography. A close examination of Table 2 further indicates a trend in which a lower correlation coefficient accompanies a lower resolution.
To minimize the effect of structural inaccuracy on flexibility prediction, we tested the all-atom MFM and residue-level MFM on six additional, randomly selected proteins with high resolutions (1 Å). They are one all-α, one all-β, and four mixed α, β proteins with the number of residues ranging from 63 to 151. The correlation coefficients are shown in Table 3. The overall result is similar to the one given in Table 2. That is, the all-atom model provides the best flexibility prediction at large cutoff distances (10.5 Å and 14.5 Å) among the three models in six out of six cases.
α-spectrin SH3 domain protein at T=5 with a contact cutoff distance of 14.5 Å. A significant correlation between the two sets of data is observed, with a correlation coefficient of 0.87.
α and one all-β proteins. Further application to 18 additional proteins indicates that the predicted protein flexibility is reasonably accurate for the majority of proteins studied (high-resolution proteins, in particular). Thus, an efficient and accurate prediction of protein flexibility is possible based on known protein structures, and the positions of both backbone and side-chain atoms are important for an accurate prediction.
| Acknowledgments |
|---|
| References |
|---|
Bahar, I., Atilgan, A.R., and Erman, B. 1997. Direct evaluation of thermal fluctuations in proteins using a single parameter harmonic potential. Fold. Des. 2: 173–181.
Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S., and Karplus, M. 1983. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4: 187–217.
Brooks III, C.L., Karplus, M., and Pettitt, B.M. 1988. Proteins: A theoretical perspective of dynamics, structure, and thermodynamics. John Wiley & Sons, New York.
Canino, L.S., Shen, T.Y., and McCammon, J.A. 2002. Changes in flexibility upon binding: Application of the self-consistent pair contact probability method to protein–protein interactions. J. Chem. Phys. 117: 9927–9933.
Dasgupta, S., Iyer, G.H., Lawrence, S.H., and Bell, J.A. 1997. Extent and nature of contacts between protein molecules in crystal lattices and between subunits of protein oligomers. Proteins 28: 494–514.
Doruker, P., Atilgan, A.R., and Bahar, I. 2000. Dynamics of proteins predicted by molecular dynamics simulations and analytical approaches: Application to α-amylase inhibitor. Proteins 40: 512–524.
Jacobs, D.J., Rader, A.J., Kuhn, L.A., and Thorpe, M.F. 2001. Protein flexibility prediction using graph theory. Proteins 44: 150–165.
Kaya, H. and Chan, H. 2003. Simple two-state protein folding kinetics requires near-Levinthal thermodynamic cooperativity. Proteins 52: 510–523.
McConkey, B.J., Sobolev, V., and Edelman, M. 2003. Discrimination of native protein structures using atom–atom contact scoring. Proc. Natl. Acad. Sci. 100: 3215–3220.
Micheletti, C., Banavar, J.R., and Maritan, A. 2001. Conformations of proteins in equilibrium. Phys. Rev. Lett. 87: 088102.
Micheletti, C., Cecconi, F., Flammini, A., and Maritan, A. 2002. Crucial stages of protein folding through a solvable model: Predicting target sites for enzyme-inhibiting drugs. Protein Sci. 11: 1878–1887.
Micheletti, C., Carloni, P., and Maritan, A. 2004. Accurate and efficient description of protein vibrational dynamics: Comparing molecular dynamics and Gaussian models. Proteins 55: 635–645.
Miyazawa, S. and Jernigan, R. 1985. Estimation of effective interresidue contact energies from protein crystal structures: Quasi-chemical approximation. Macromolecules 18: 534–552.
Privalov, P.L. 1979. Stability of proteins: Small globular proteins. Adv. Protein Chem. 33: 167–241.
Shen, T., Canino, L.S., and McCammon, J.A. 2002. Unfolding proteins under external forces: A solvable model under the self-consistent pair contact probability approximation. Phys. Rev. Lett. 89: 068103.
Taketomi, H., Ueda, Y., and Go, N. 1975. Studies on protein folding, unfolding and fluctuations by computer simulations. Int. J. Peptide Protein Res. 7: 445–459.
Ueda, Y., Taketomi, H., and Go, N. 1978. Studies on protein folding, unfolding and fluctuations by computer simulations, II: A three-dimensional lattice model of lysozyme. Biopolymers 17: 1531–1548.
Zhou, Y. and Linhananta, A. 2002. Thermodynamics of an all-atom off-lattice model of the fragment B of staphylococcal protein A: Implication for the origin of the cooperativity of protein folding. J. Phys. Chem. B 106: 1481–1485.
Zhou, Y., Hall, C.K., and Karplus, M. 1999. The calorimetric criterion for a two-state process revisited. Protein Sci. 8: 1064–1074.