Howard Hughes Medical Institute (HHMI) Center for Single Molecule Biophysics, Department of Physiology and Biophysics, State University of New York (SUNY) at Buffalo, Buffalo, New York 14214, USA
Reprint requests to: Yaoqi Zhou, Howard Hughes Medical Institute Center for Single Molecule Biophysics, SUNY Buffalo, 124 Sherman Hall, Buffalo, NY 14214, USA; e-mail: yqzhou{at}buffalo.edu; fax: (716) 829-2344.
| Abstract |
|---|
… Cα positions as the interaction centers recognizes 123 native structures out of a comprehensive 125-protein TOUCHSTONE decoy set in which each protein has 24,000 decoys with only Cα positions. Furthermore, the performance of DFIRE-SCM on 25 newly established monomeric and 31 docking Rosetta decoy sets is comparable to (or, for the monomeric decoy sets, better than) that of a recently developed, all-atom Rosetta energy function enhanced with an orientation-dependent hydrogen-bonding potential.

Keywords: Knowledge-based potential; decoy sets; ideal-gas reference state; residue-level potential
1 These two authors contributed equally to this work.
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.03348304.
| Introduction |
|---|
A residue-specific, all-atom, distance-dependent potential of mean-force was recently extracted from the structures of single-chain proteins by using a physical state of uniformly distributed points in finite spheres (distance-scaled, finite, ideal-gas reference [DFIRE] state) as the zero-interaction reference state (Zhou and Zhou 2002). Remarkably, the physical reference state yields a potential of mean-force that no longer possesses some unphysical characteristics associated with other statistical potentials. It was shown that the accuracy of DFIRE-based potential is insensitive to the partitioning of hydrophobic and hydrophilic residues within a protein (Zhou and Zhou 2002). More importantly, the new structure-derived potential can quantitatively reproduce the likelihood of a residue to be buried (i.e., the composition difference of amino-acid residues between core and surface; Zhou and Zhou 2004). The potential also produces a stability scale of amino-acid residues in quantitative agreement with that independently extracted from mutation experimental data (Zhou and Zhou 2004). Moreover, the "monomer" potential (derived from single-chain proteins) is found to be equally successful in discriminating against docking decoys, distinguishing true dimeric interface from crystal interfaces, and predicting binding free energy of protein-protein and protein-peptide complexes (Liu et al. 2004). The independence of the performance on amino-acid compositions suggests that the DFIRE-based potential captures the essence of the common physical interaction masked under different compositions of amino-acid residues on the surface, at the core, and at the interface of proteins.
The DFIRE-based potential was an all-atom potential. An initial study of the potential at the level of Cβ atoms plus backbone atoms indicated that the accuracy of the potential is somewhat reduced but remains reasonable (Zhou and Zhou 2002). In the present study, we further reduced the number of atoms representing a residue to a single united center such as Cα (Melo et al. 2002), Cβ (Hendlich et al. 1990), or side-chain center of mass (SCM, geometry; Bryant and Lawrence 1993; Kocher et al. 1994; Thomas and Dill 1996; Zhang and Kim 2000). The united-residue potential of mean force was tested on multiple decoy sets of single-chain proteins as well as on docking decoys. We show that the DFIRE-SCM potential of mean force is even more successful than the all-atom potentials of mean force based on statistically averaged reference states (RAPDF; Samudrala and Moult 1998) and atomic Knowledge-Based Potential (KBP; Lu and Skolnick 2001) in recognizing native structures from 96 multiple decoy sets and 21 docking decoy sets. It is also more successful than a sophisticated semiphysical energy function enhanced with hydrogen-bonding interactions (Kortemme-Morozov-Baker [KMB] potential; Kortemme et al. 2003) in structure discrimination using a new Rosetta monomeric decoy set, and it is comparably successful on the Rosetta docking decoy set. The results suggest that the DFIRE-SCM potential is one of the most accurate coarse-grained potentials and should be useful in assisting structure prediction on a genomic scale (Baker and Sali 2001; Schonbrun et al. 2002; Vajda et al. 2002; Janin and Seraphin 2003).
| Results |
|---|
… (Cα, Cβ, and SCM) yields the most accurate distance-dependent pair potential. The three representations have different characteristics: the Cα-Cα distance reflects the proximity of backbone atoms, the Cβ-Cβ potential is sensitive to the direction of the side chains, and the center of mass, on the other hand, takes into account the average side-chain conformations (Kocher et al. 1994). Figure 1 … Cα, Cβ, to SCM. For each type of interaction center, the DFIRE-based potential outperforms the other two methods by a significant 9% to 22%. More importantly, the average Z-score increases significantly from 3.26 for RAPDF-SCM (or 3.99 for KBP-SCM) to 4.30 for DFIRE-SCM. This suggests that DFIRE-SCM provides a stronger bias against decoys than the other two methods. Because the potential based on SCM performs the best, as found earlier (Kocher et al. 1994), hereafter we shall report the results from SCM-based potentials only, unless indicated otherwise.
It is of interest to know the loss in accuracy after reducing the all-atom representation to a single interaction center. Table 3 compares the performance of DFIRE-SCM with those of the all-atom versions of RAPDF, atomic KBP, and DFIRE (see also Fig. 1). Remarkably, DFIRE-SCM is more accurate than the all-atom versions of RAPDF and atomic KBP in native structure selection in all decoy sets except the CASP4 and lattice_ssfit decoy sets. For the CASP4 decoy sets, the number of native structures ranked first is 19 for DFIRE-SCM and 20 for KBP-all-atom and RAPDF-all-atom. However, the average Z-score from DFIRE-SCM (3.15) is higher than that from either KBP-all-atom (2.93) or RAPDF-all-atom (2.17). For lattice_ssfit, only the average Z-score from DFIRE-SCM (6.19) is lower than that from either KBP-all-atom (6.61) or RAPDF-all-atom (7.18). Over all 96 multiple decoy sets, however, the success rate of DFIRE-SCM is 10% higher than that of RAPDF-all-atom (or 15% higher in the case of atomic KBP). The average Z-score given by DFIRE-SCM is also higher than those given by both RAPDF-all-atom and KBP-all-atom. The change in accuracy after reducing the all-atom representation to a single interaction center for DFIRE is small except for the lmds decoy sets, where the number of rank-1 native structures is seven for the all-atom DFIRE potential, compared to three for the DFIRE-SCM potential. (The number of native structures in the lmds set within the top 5 is also seven for DFIRE-SCM, however.) The overall reduction in success rate based on top 1 ranking for all 96 decoy sets is only 2%. However, both potentials have nearly the same success rate based on the top 5 ranking (~89%). Thus, the abilities of DFIRE-all-atom and DFIRE-SCM to distinguish native structures from decoys are comparable for these 96 decoy sets. We also observed similar behavior for the RAPDF and KBP potentials.
Table 4 compares the results of the DFIRE-SCM potential with those of the all-atom knowledge-based Lu-Lu-Skolnick (LLS) potential (Lu et al. 2003). The LLS potential is the KBP trained with interfacial structures of dimers. The performance of the all-atom LLS potential is worse than that of the DFIRE-SCM potential, although the latter was not trained with any interfacial information. The success rates of the all-atom LLS potential based on top 1 ranking are 10/15 (67%) for dimers and 1/6 (17%) for trimers. The corresponding rates for DFIRE-SCM are 13/15 (87%) and 4/6 (67%), respectively. It is of interest to note that there is also a residue-level LLS potential whose success rates are significantly lower than those of its all-atom counterpart (20% for dimers and 0% for trimers). The DFIRE-all-atom potential, on the other hand, achieved 100% success rates for both dimers and trimers (Liu et al. 2004).
Table 5 compares the Z-scores from DFIRE-SCM with those from the KMB potential (Kortemme et al. 2003) for both monomer and docking decoy sets. The Z-score of the KMB potential ranges from -1.53 to 8.22 for the monomeric decoy sets and from -1.03 to 14.06 for the docking decoy sets. These strongly fluctuating Z-score values suggest that the KMB potential is suitable for discriminating some proteins but not others. On the other hand, the Z-score of the DFIRE-SCM potential is relatively stable, with a much narrower range: between 0.48 and 5.21 for the monomeric decoy sets and between -0.45 and 3.36 for docking decoys. The overall success rate for KMB is 22/25 (88%) and 23/31 (74%) for monomeric and docking decoys, respectively. The corresponding numbers for DFIRE-SCM are 24/25 (96%) and 23/31 (74%), respectively. Thus, DFIRE-SCM is more successful in discriminating against decoys than KMB for monomeric proteins and is comparably successful for docking decoys. This is remarkable considering that the KMB potential is an all-atom potential with sophisticated, weight-optimized energetic terms for van der Waals, solvation, hydrogen-bond interactions, and rotamer probabilities.
… Cα positions, only the performances of RAPDF-Cα, KBP-Cα, and DFIRE-Cα are compared in Table 6. … DFIRE-Cα is able to identify 123 native structures out of 125 decoy sets (98%). This is in sharp contrast to 79 native structures by RAPDF-Cα (63%) and 45 native structures by KBP-Cα (36%). The average Z-score given by DFIRE-Cα (7.96) is also significantly higher than those given by either RAPDF-Cα (4.59) or KBP-Cα (3.01).
| Discussion |
|---|
The ability to successfully select native structures from decoys is the minimum requirement for an energy function. A stricter requirement for an energy function is its ability to discriminate near-native conformations in the absence of the native conformation. Although this stricter requirement is usually reserved for more refined energy functions at an all-atom level, it is of interest to know the performance of DFIRE-SCM in this respect. One way to characterize the ability to detect near-native conformations is the near-native Z-score, that is, the score difference between the high-rmsd decoys and the low-rmsd decoys normalized by the score fluctuation of the high-rmsd decoys (Kortemme et al. 2003). A decoy is considered a low-rmsd decoy if it is in the lowest 5% of the rmsd distribution (Kortemme et al. 2003). For a separate group of 23 monomeric Rosetta decoy sets (Kortemme et al. 2003), there are only four proteins whose near-native Z-scores are greater than 1 for the KMB energy function. The corresponding number is four for DFIRE-all-atom and two for DFIRE-SCM. For 31 Rosetta docking decoys, there are 22, 22, and 14 proteins with a Z-score (near-native) > 1 for KMB, DFIRE-all-atom, and DFIRE-SCM, respectively. Another way to characterize the ability to detect near-native conformations is the correlation between energy score and rmsd when the rmsd is smaller than about 3 Å (Kortemme et al. 2003). In the docking decoy set, the number of proteins whose correlation coefficients are equal to or greater than 0.5 is 18 for KMB, 23 for DFIRE-all-atom, and 15 for DFIRE-SCM. Examples are given in Figs. 4 to 6, where the rmsd values of decoys are plotted against their energy scores for some selected monomeric proteins, antibody/antigen, and non-antibody complexes. It is clear that DFIRE-SCM is not as good as DFIRE-all-atom or KMB in detecting near-native conformations, whereas DFIRE-all-atom and KMB have comparable ability in detecting near-native structures based on this Rosetta decoy set. However, as is common practice, the above results were obtained by DFIRE-all-atom and DFIRE-SCM without performing any minimization on either native structures or decoys, because of the discretization of knowledge-based potentials. We are currently developing techniques for minimization. Preliminary results suggest that minimization can further improve the detection of near-native conformations by DFIRE-all-atom and DFIRE-SCM. The details will be reported elsewhere.
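The correlation criterion used in this discussion can be illustrated with a minimal Python sketch. It assumes a plain Pearson correlation over decoys whose rmsd is below 3 Å; the function names and the strict `<` comparison are illustrative choices, not details taken from Kortemme et al. (2003).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def near_native_correlation(scores, rmsds, rmsd_max=3.0):
    """Correlate energy score with rmsd using only decoys below rmsd_max.

    rmsd_max = 3.0 follows the "about 3 angstroms" threshold in the text;
    the strict comparison is an assumption of this sketch.
    """
    kept = [(s, r) for s, r in zip(scores, rmsds) if r < rmsd_max]
    return pearson([s for s, _ in kept], [r for _, r in kept])
```

A protein would count toward the totals quoted above when this coefficient is 0.5 or greater. The sketch needs at least two retained decoys with distinct scores and rmsd values, otherwise the denominator is zero.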
| Materials and methods |
|---|
The DFIRE potential of mean force $u(i, j, r)$ between atom (or residue) types $i$ and $j$ that are distance $r$ apart is given by (Zhou and Zhou 2002):

$$
u(i, j, r) =
\begin{cases}
-\eta R T \ln \dfrac{N_{\mathrm{obs}}(i, j, r)}{\left(\dfrac{r}{r_{\mathrm{cut}}}\right)^{\alpha} \dfrac{\Delta r}{\Delta r_{\mathrm{cut}}} \, N_{\mathrm{obs}}(i, j, r_{\mathrm{cut}})}, & r < r_{\mathrm{cut}}, \\[2ex]
0, & r \ge r_{\mathrm{cut}},
\end{cases}
\tag{1}
$$

where $\eta$ (= 0.0157) is a scaling constant, $R$ is the gas constant, $T$ = 300 K, $\alpha$ = 1.61, $N_{\mathrm{obs}}(i, j, r)$ is the number of $(i, j)$ pairs within the distance shell $r$ observed in a given structure database, $r_{\mathrm{cut}}$ = 14.5 Å, and $\Delta r$ ($\Delta r_{\mathrm{cut}}$) is the bin width at $r$ ($r_{\mathrm{cut}}$). ($\Delta r$ = 2 Å for $r$ < 2 Å; $\Delta r$ = 0.5 Å for 2 Å < $r$ < 8 Å; $\Delta r$ = 1 Å for 8 Å < $r$ < 15 Å.) The exponent $\alpha$ for the distance dependence was obtained from the distance dependence of the pair distribution function for uniformly distributed points in finite spheres (finite ideal-gas reference state). The number of observed atomic (force centroid) pairs $(i, j)$ within the distance shell $r$ [$N_{\mathrm{obs}}(i, j, r)$] was obtained from a structural database of 1011 nonhomologous (< 30% homology) proteins with resolution < 2 Å, which was collected by Hobohm et al. (1992), http://chaos.fccc.edu/research/labs/dunbrack/culledpdb.html. The potential $u(i, j, r)$ is set to $2\eta$ if $N_{\mathrm{obs}}(i, j, r)$ = 0.
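As an illustration of how Equation 1 turns pair counts into an energy, the following Python sketch evaluates the potential for a single pair type and distance bin. It is a minimal sketch under stated assumptions: the gas constant is taken in kcal/(mol K), the bin boundaries at exactly 2 Å and 8 Å are assigned to the wider-range bin, the empty-bin value is read as 2η, and the names `bin_width` and `dfire_potential` are hypothetical.

```python
import math

ETA = 0.0157            # scaling constant (from the text)
ALPHA = 1.61            # distance exponent (from the text)
R_CUT = 14.5            # cutoff distance in angstroms (from the text)
RT = 0.0019872 * 300.0  # gas constant in kcal/(mol K) times T = 300 K (unit choice is an assumption)

def bin_width(r):
    """Bin width in angstroms at distance r, following the binning in the text."""
    if r < 2.0:
        return 2.0
    if r < 8.0:
        return 0.5
    return 1.0

def dfire_potential(n_obs_r, n_obs_cut, r):
    """Pair potential for one (i, j) type pair in the bin at distance r.

    n_obs_r   -- observed count of (i, j) pairs in the shell at r
    n_obs_cut -- observed count in the shell at R_CUT (assumed nonzero)
    """
    if r >= R_CUT:
        return 0.0
    if n_obs_r == 0:
        return 2.0 * ETA  # empty-bin value, read here as 2*eta (assumption)
    expected = (r / R_CUT) ** ALPHA * (bin_width(r) / bin_width(R_CUT)) * n_obs_cut
    return -ETA * RT * math.log(n_obs_r / expected)
```

When the observed count equals the count expected from the finite ideal-gas reference state, the logarithm vanishes and the potential is zero; more pairs than expected gives a negative (favorable) value.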
Residue-specific atomic types were used (167 atomic types; Samudrala and Moult 1998; Lu and Skolnick 2001). For a residue-based potential, all atoms in a residue are replaced by a united interaction site located at Cα, Cβ, or SCM, respectively. The number of force-centroid types for each of the three reduced potentials is 20. We use the same equation (1), the same parameters, and the same binning procedure to generate DFIRE-Cα, DFIRE-Cβ, and DFIRE-SCM, which denote the Cα-based, Cβ-based, and SCM-based residue-level potentials, respectively. This is reasonable because residue-specific atomic types were used in generating the all-atom DFIRE potential.
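The reduction of a residue to a single interaction site can be sketched as follows. This is an illustrative Python example, not the authors' code: the residue is assumed to be given as heavy atoms with PDB-style atom names, the SCM is taken as a mass-weighted average of side-chain heavy atoms, and the glycine fallback to the alpha carbon is an assumption of the sketch.

```python
BACKBONE = {"N", "CA", "C", "O"}                           # PDB-style backbone atom names
MASS = {"C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}  # heavy-atom masses

def interaction_center(atoms, mode):
    """Return the (x, y, z) interaction site of one residue.

    atoms -- list of (atom_name, element, (x, y, z)) for heavy atoms only
    mode  -- "CA", "CB", or "SCM"
    """
    coords = {name: xyz for name, _, xyz in atoms}
    if mode == "CA":
        return coords["CA"]
    if mode == "CB":
        # Glycine has no beta carbon; falling back to CA is an assumption.
        return coords.get("CB", coords["CA"])
    side = [(MASS[el], xyz) for name, el, xyz in atoms if name not in BACKBONE]
    if not side:
        return coords["CA"]  # glycine fallback (assumption)
    total = sum(m for m, _ in side)
    return tuple(sum(m * xyz[k] for m, xyz in side) / total for k in range(3))
```

With any of the three modes each residue contributes one site, so the pair potential needs only 20 types instead of 167.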
The RAPDF and KBP potentials
In order to compare the DFIRE-based potentials with the RAPDF (Samudrala and Moult 1998) and KBP (Lu and Skolnick 2001) potentials, we regenerated the two potentials using the same procedures described in their original papers. For RAPDF (Samudrala and Moult 1998), the first bin covers 0 to 3.0 Å, and the distance between 3.0 and 20 Å is binned every 1 Å. The total number of bins is 18. All 18 bins with a cutoff distance of 20 Å are used for scoring. For KBP (Lu and Skolnick 2001), the distance between 1.5 Å and 14.5 Å is binned every 1 Å and the last bin is from 14.5 Å to infinity. The total number of bins is 14. The first and second sequence neighbors are excluded, whereas backbone atoms are included in counting contacts. When used in scoring, only the bins covering 3.5 to 6.5 Å are used. In all cases, contacts between atoms within a single residue are excluded from the counts and scoring. In case of zero pairs, both potentials are set to be 2η kcal/mole. The structural database is the 1011 structures described above for the DFIRE-based potentials rather than the 265 proteins used in RAPDF and 1291 proteins used in atomic KBP in their respective original publications. It was shown that the change of database has little effect on the overall accuracy of the RAPDF and atomic KBP potentials (Zhou and Zhou 2002). For RAPDF and KBP residue-based potentials, we used the same force centroids as for DFIRE. We used the same equation, the same parameters, and the same binning procedure to generate RAPDF-Cα (KBP-Cα), RAPDF-Cβ (KBP-Cβ), and RAPDF-SCM (KBP-SCM), denoting the Cα-based, Cβ-based, and SCM-based residue-level potentials, respectively. No attempts were made to optimize the parameters and/or procedures presented in the original papers for possibly better performance.
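The two binning schemes can be written down compactly. The Python sketch below reads the RAPDF bins as 0 to 3.0 Å followed by 1 Å bins up to 20 Å, and the KBP bins as 1 Å bins from 1.5 Å to 14.5 Å plus one open-ended bin. The function names, the half-open bin edges, and the handling of distances below 1.5 Å for KBP are assumptions of this sketch.

```python
def rapdf_bin(r):
    """Bin index (0..17) for RAPDF, or None beyond the 20 angstrom cutoff."""
    if r < 0.0 or r >= 20.0:
        return None
    if r < 3.0:
        return 0                 # first bin: 0 to 3.0 angstroms
    return 1 + int(r - 3.0)      # 17 bins of 1 angstrom from 3.0 to 20

def kbp_bin(r):
    """Bin index (0..13) for KBP, or None below 1.5 angstroms (assumption)."""
    if r < 1.5:
        return None
    if r >= 14.5:
        return 13                # last bin: 14.5 angstroms to infinity
    return int(r - 1.5)          # 13 bins of 1 angstrom from 1.5 to 14.5

# Bins covering 3.5 to 6.5 angstroms, the only ones used when scoring with KBP.
KBP_SCORING_BINS = (2, 3, 4)
```

Counting the indices confirms the totals stated in the text: 1 + 17 = 18 bins for RAPDF and 13 + 1 = 14 bins for KBP.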
Structure selections from decoys
For a given 3-D structure of a protein, the total residue-residue potential of mean force, $G$, is

$$
G = \frac{1}{2} \sum_{i, j} u(i, j, r_{ij}) \tag{2}
$$
where the summation is over all pairs of residues. In structure selections from decoy sets, the total potential G is calculated for each structure, including native state and decoys. The native state is correctly identified if its structure has the lowest value of G. Z-score is defined as
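$$
Z = \frac{\langle G^{\mathrm{decoy}} \rangle - G^{\mathrm{native}}}{\sqrt{\langle (G^{\mathrm{decoy}})^{2} \rangle - \langle G^{\mathrm{decoy}} \rangle^{2}}}
$$

where $\langle \cdot \rangle$ denotes the average over all decoy structures of a given native protein, and $G^{\mathrm{native}}$ is the total residue-residue potential of the native structure. The Z-score is a measure of the bias toward the native structure. To characterize the ability to detect near-native conformations, the near-native Z-score, which is the score difference between the high-rmsd decoys and the low-rmsd decoys normalized by the score fluctuation of the high-rmsd decoys, was used (Kortemme et al. 2003). The near-native Z-score is expressed as (Kortemme et al. 2003)

$$
Z_{\text{near-native}} = \frac{\langle G^{\mathrm{decoy}} \rangle_{\mathrm{hi}} - \langle G^{\mathrm{decoy}} \rangle_{\mathrm{lo}}}{\sigma_{\mathrm{hi}}} \tag{3}
$$

where $\langle G^{\mathrm{decoy}} \rangle_{\mathrm{lo}}$ ($\langle G^{\mathrm{decoy}} \rangle_{\mathrm{hi}}$) is the average energy score of the low (high)-rmsd decoys, and $\sigma_{\mathrm{hi}}$ is the standard deviation of the energy score of the high-rmsd decoys. A decoy is considered a low-rmsd decoy if it is in the lowest 5% of the rmsd distribution (Kortemme et al. 2003). The low-rmsd decoys represent the near-native structures.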
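The selection criteria in this subsection can be sketched in a few lines of Python. The example below is illustrative only: it assumes the total scores have already been computed, uses the population standard deviation for the score fluctuation, and treats every decoy outside the lowest 5% of rmsd as a high-rmsd decoy, which is an assumption rather than a detail stated in the text.

```python
import statistics

def native_rank(g_native, g_decoys):
    """Rank of the native structure (1 means lowest score overall)."""
    return 1 + sum(1 for g in g_decoys if g < g_native)

def z_score(g_native, g_decoys):
    """(mean decoy score - native score) / decoy score fluctuation."""
    return (statistics.fmean(g_decoys) - g_native) / statistics.pstdev(g_decoys)

def near_native_z(scores, rmsds, low_fraction=0.05):
    """Near-native Z-score: (mean high-rmsd score - mean low-rmsd score) / sigma_hi.

    Decoys in the lowest `low_fraction` of rmsd are the low-rmsd set;
    all remaining decoys form the high-rmsd set (assumption).
    """
    order = sorted(range(len(rmsds)), key=lambda k: rmsds[k])
    n_low = max(1, int(low_fraction * len(order)))
    low = [scores[k] for k in order[:n_low]]
    high = [scores[k] for k in order[n_low:]]
    return (statistics.fmean(high) - statistics.fmean(low)) / statistics.pstdev(high)
```

A native structure counts as correctly identified when `native_rank` returns 1. Both Z-scores are positive when the native or near-native structures score lower than the bulk of the decoys, and both are undefined when the relevant decoy scores are all identical.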
Structure selections from docking decoys/artificial interfaces
The binding free energy of a dimer AB is obtained as follows:

$$
\Delta G_{\mathrm{bind}} = G_{AB} - (G_{A} + G_{B}) \tag{4}
$$

Because the structures of monomers are approximated as rigid bodies and the residues at the interface contribute most to $\Delta G_{\mathrm{bind}}$, equation 4 can be further simplified to

$$
\Delta G_{\mathrm{bind}} = \frac{1}{2} \sum_{i, j}^{\text{interface}} u(i, j, r_{ij}) \tag{5}
$$

where the summation is over any two atoms belonging to an "interacting" residue pair from different chains at the interface. We follow the definition of Lu et al. (2003), in which an interacting residue pair is a pair of residues from different chains that have at least one pair of heavy atoms within 4.5 Å of each other. Equation 5 can also be used for complexes with more than two partners. The binding free energy $\Delta G_{\mathrm{bind}}$ is calculated for each docking decoy (or artificial interface). The native state is correctly identified if $\Delta G_{\mathrm{bind}}^{\mathrm{native}}$ is the lowest value among all $\Delta G_{\mathrm{bind}}$ values (the first rank). A Z-score is defined as

$$
Z = \frac{\langle \Delta G_{\mathrm{bind}}^{\mathrm{decoy}} \rangle - \Delta G_{\mathrm{bind}}^{\mathrm{native}}}{\sqrt{\langle (\Delta G_{\mathrm{bind}}^{\mathrm{decoy}})^{2} \rangle - \langle \Delta G_{\mathrm{bind}}^{\mathrm{decoy}} \rangle^{2}}}
$$

where $\langle \cdot \rangle$ denotes the average over all decoy structures of a given protein. The Z-score is a measure of the free-energy bias toward the native complex structure. For docking decoys, we used the same definition of near-native Z-score to evaluate the ability of a potential to recognize near-native structures, except that the energy for monomer decoys is replaced by the binding free energy $\Delta G_{\mathrm{bind}}$.
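The interface definition and the simplified binding score can be illustrated with a short Python sketch. It is a sketch under stated assumptions: each chain is a dictionary from a residue identifier to a list of heavy atoms given as (type, coordinates), "within 4.5 Å" is read as less than or equal to 4.5 Å, each inter-chain atom pair is counted once, and `pair_potential` is a hypothetical callable standing in for a tabulated potential.

```python
import math

def interacting_pairs(chain_a, chain_b, cutoff=4.5):
    """Residue pairs from different chains with any heavy-atom pair within cutoff."""
    pairs = []
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            if any(math.dist(pa, pb) <= cutoff
                   for _, pa in atoms_a for _, pb in atoms_b):
                pairs.append((res_a, res_b))
    return pairs

def binding_score(chain_a, chain_b, pair_potential, cutoff=4.5):
    """Sum the pair potential over all atom pairs of interacting residue pairs.

    pair_potential(type_a, type_b, r) is a hypothetical lookup into a
    tabulated potential; every inter-chain atom pair is counted once.
    """
    total = 0.0
    for res_a, res_b in interacting_pairs(chain_a, chain_b, cutoff):
        for type_a, pa in chain_a[res_a]:
            for type_b, pb in chain_b[res_b]:
                total += pair_potential(type_a, type_b, math.dist(pa, pb))
    return total
```

Note that once a residue pair qualifies as interacting, all of its atom pairs enter the sum, including those farther apart than 4.5 Å. For a residue-level potential such as DFIRE-SCM, each residue would carry a single site in the atom list.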
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
Baker, D. and Sali, A. 2001. Protein structure prediction and structural genomics. Science 294: 93-96.
Betancourt, M.R. and Thirumalai, D. 1999. Pair potentials for protein folding: Choice of reference states and sensitivity of predicted native states to variations in the interaction schemes. Protein Sci. 8: 361-369.
Bonneau, R., Strauss, C., Rohl, C., Chivian, D., Bradley, P., Malmstrom, L., Robertson, T., and Baker, D. 2002. De novo prediction of three-dimensional structures for major protein families. J. Mol. Biol. 322: 65-78.
Bowie, J.U. and Eisenberg, D. 1994. An evolutionary approach to folding small α-helical proteins that uses sequence information and an empirical guiding fitness function. Proc. Natl. Acad. Sci. 91: 4436-4440.
Bryant, S.H. and Lawrence, C.E. 1993. An empirical energy function for threading protein sequence through the folding motif. Proteins 16: 92-112.
Chhajer, M. and Crippen, G.M. 2002. A protein folding potential that places the native states of a large number of proteins near a local minimum. BMC Struct. Biol. 2: 4.
Colovos, C. and Yeates, T.O. 1993. Verification of protein structures: Patterns of nonbonded atomic interaction. Protein Sci. 2: 1511-1519.
Dill, K.A. and Chan, H.S. 1997. From Levinthal to pathways to funnels. Nat. Struct. Biol. 4: 10-19.
Dobson, C.M., Sali, A., and Karplus, M. 1998. Protein folding: A perspective from theory and experiment. Angew. Chem. Int. Ed. Engl. 198: 868-893.
Eisenberg, D., Lüthy, R., and Bowie, J.U. 1997. VERIFY3D: Assessment of protein models with three-dimensional profile. Methods Enzymol. 277: 396-404.
Elcock, A. and McCammon, J. 2001. Identification of protein oligomerization states by analysis of interface conservation. Proc. Natl. Acad. Sci. 98: 2990-2994.
Eyrich, V.A., Standley, D.M., and Friesner, R.A. 1999. Prediction of protein tertiary structure to low resolution: Performance for a large and structurally diverse test set. J. Mol. Biol. 288: 725-742.
Feig, M. and Brooks III, C.L. 2002. Evaluating CASP4 predictions with physical energy functions. Proteins 49: 232-245.
Glaser, F., Sternberg, D., Vakser, I., and Ben-Tal, N. 2001. Residue frequencies and pairing preferences at protein-protein interfaces. Proteins 43: 89-102.
Godzik, A., Kolinski, A., and Skolnick, J. 1995. Are proteins ideal mixtures of amino acids? Analysis of energy parameter sets. Protein Sci. 4: 2107-2117.
Gray, J., Moughon, S., Kortemme, T., Furman, O., Misura, K., Morozov, A., and Baker, D. 2003. Protein-protein docking predictions for the CAPRI experiment. Proteins 52: 118-122.
Hendlich, M., Lackner, P., Weitckus, S., Floeckner, H., Froschauer, R., Gottsbacher, K., Casari, G., and Sippl, M.J. 1990. Identification of native protein folds amongst a large number of incorrect models. The calculation of low energy conformations from potentials of mean force. J. Mol. Biol. 216: 167-180.
Hinds, D. and Levitt, M. 1992. A lattice model for protein structure prediction at low resolution. Proc. Natl. Acad. Sci. 89: 2536-2540.
Hobohm, U., Scharf, M., Schneider, R., and Sander, C. 1992. Selection of representative protein data sets. Protein Sci. 1: 409-417.
Honig, B. 1999. Protein folding: From the Levinthal paradox to structure prediction. J. Mol. Biol. 293: 283-293.
Janin, J. and Seraphin, B. 2003. Genome-wide studies of protein-protein interaction. Curr. Opin. Struct. Biol. 13: 383-388.
Jones, D.T., Taylor, W.R., and Thornton, J.M. 1992. A new approach to protein fold recognition. Nature 358: 86-89.
Keasar, C. and Levitt, M. 2003. A novel approach to decoy set generation: Designing a physical energy function having local minima with native structure characteristics. J. Mol. Biol. 329: 159-174.
Kihara, D., Lu, H., Kolinski, A., and Skolnick, J. 2001. TOUCHSTONE: An ab initio protein structure prediction method that uses threading-based tertiary restraints. Proc. Natl. Acad. Sci. 98: 10125-10130.
Kihara, D., Zhang, Y., Lu, H., Kolinski, A., and Skolnick, J. 2002. Ab initio protein structure prediction on a genomic scale: Application to the Mycoplasma genitalium genome. Proc. Natl. Acad. Sci. 99: 5993-5998.
Kocher, J.-P.A., Rooman, M., and Wodak, S. 1994. Factors influencing the ability of knowledge-based potentials to identify native sequence-structure matches. J. Mol. Biol. 235: 1598-1613.
Kortemme, T., Morozov, A., and Baker, D. 2003. An orientation-dependent hydrogen bonding potential improves prediction of specificity and structure for proteins and protein-protein complexes. J. Mol. Biol. 326: 1239-1259.
Lazaridis, T. and Karplus, M. 2000. Effective energy function for protein structure prediction. Curr. Opin. Struct. Biol. 10: 139-145.
Levitt, M. 1976. A simplified representation of protein conformations for rapid simulation of protein folding. J. Mol. Biol. 104: 59-107.
Li, X., Hu, C., and Liang, J. 2003. Simplicial edge representation of protein structures and α contact potential with confidence measure. Proteins 53: 792-805.
Liu, S., Zhang, C., Zhou, H., and Zhou, Y. 2004. A physical reference state unifies the structure-derived potential of mean force for protein folding and binding. Proteins (in press).
Lu, H. and Skolnick, J. 2001. A distance-dependent atomic knowledge-based potential for improved protein structure selection. Proteins 44: 223-232.
Lu, H., Lu, L., and Skolnick, J. 2003. Development of unified statistical potentials describing protein-protein interactions. Biophys. J. 84: 1895-1901.
McConkey, B.J., Sobolev, V., and Edelman, M. 2002. Discrimination of native protein structures using atom-atom contact scoring. Proc. Natl. Acad. Sci. 100: 3215-3220.
Melo, F., Sanchez, R., and Sali, A. 2002. Statistical potentials for fold assessment. Protein Sci. 11: 430-448.
Mintseris, J. and Weng, Z. 2003. Atomic contact vectors in protein-protein recognition. Proteins 53: 629-639.
Miyazawa, S. and Jernigan, R.L. 1985. Estimation of effective interresidue contact energies from protein crystal structures: Quasi-chemical approximation. Macromolecules 18: 534-552.
Miyazawa, S. and Jernigan, R.L. 1999. An empirical energy potential with a reference state for protein fold and sequence recognition. Proteins 36: 357-369.
Moont, G., Gabb, H., and Sternberg, M. 1999. Use of pair potentials across protein interfaces in screening predicted docked complexes. Proteins 35: 364-373.
Nanias, M., Chinchio, M., Pillardy, J., Ripoll, D., and Scheraga, H. 2003. Packing helices in proteins by global optimization of a potential energy function. Proc. Natl. Acad. Sci. 100: 1706-1710.
Ofran, Y. and Rost, B. 2003. Analyzing six types of protein-protein complexes. J. Mol. Biol. 325: 377-387.
Panchenko, A.R., Marchler-Bauer, A., and Bryant, S.H. 2000. Combination of threading potentials and sequence profiles improves fold recognition. J. Mol. Biol. 296: 1319-1331.
Park, B. and Levitt, M. 1996. Energy functions that discriminate X-ray and near native folds from well-constructed decoys. J. Mol. Biol. 258: 367-392.
Pillardy, J., Czaplewski, C., Liwo, A., Lee, J., Ripoll, D.R., Kaźmierkiewicz, R., Oldziej, S., Wedemeyer, W.J., Gibson, K.D., Arnautova, Y.A., et al. 2001. Recent improvements in prediction of protein structure by global optimization of a potential energy function. Proc. Natl. Acad. Sci. 98: 2329-2333.
Ponstingl, H., Henrick, K., and Thornton, J. 2000. Discriminating between homodimeric and monomeric proteins in the crystalline state. Proteins 41: 47-57.
Samudrala, R. and Moult, J. 1998. An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction. J. Mol. Biol. 275: 895-916.
Samudrala, R., Xia, Y., Levitt, M., and Huang, E. 1999. A combined approach for ab initio construction of low resolution protein tertiary structures from sequence. Pac. Symp. Biocomput. 4: 505-506.
Schonbrun, J., Wedemeyer, W., and Baker, D. 2002. Protein structure prediction in 2002. Curr. Opin. Struct. Biol. 12: 348-354.
Simons, K.T., Kooperberg, C., Huang, E., and Baker, D. 1997. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions. J. Mol. Biol. 268: 209-225.
Simons, K., Bonneau, R., Ruczinski, I., and Baker, D. 1999. Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins 37(S3): 171-176.
Simons, K., Strauss, C., and Baker, D. 2001. Prospects for ab initio protein structural genomics. J. Mol. Biol. 306: 1191-1199.
Sippl, M.J. 1990. Calculation of conformational ensembles from potentials of mean force. An approach to the knowledge-based prediction of local structures in globular proteins. J. Mol. Biol. 213: 859-883.
Sippl, M.J. 1993. Recognition of errors in three-dimensional structures of proteins. Proteins 17: 355-362.
Tanaka, S. and Scheraga, H.A. 1976. Medium- and long-range interaction parameters between amino acids for predicting three-dimensional structures of proteins. Macromolecules 9: 945-950.
Thomas, P.D. and Dill, K.A. 1996. Statistical potentials extracted from protein structures: How accurate are they? J. Mol. Biol. 257: 457-469.
Tobi, D. and Elber, R. 2000. Distance-dependent, pair potential for protein folding: Results from linear optimization. Proteins 41: 40-46.
Tsai, J., Bonneau, R., Morozov, A.V., Kuhlman, B., Rohl, C.A., and Baker, D. 2003. An improved protein decoy set for testing energy functions for protein structure prediction. Proteins 53: 76-87.
Vajda, S., Vakser, I., Sternberg, M., and Janin, J. 2002. Modeling of protein interactions in genomes. Proteins 47: 444-446.
Vendruscolo, M., Mirny, L., Shakhnovich, E.I., and Domany, E. 2000. Comparison of two optimization methods to derive energy parameters for protein folding: Perceptron and Z score. Proteins 41: 192-201.
Vijayakumar, M. and Zhou, H.-X. 2000. Prediction of residue-residue pair frequencies in proteins. J. Phys. Chem. B 104: 9755-9764.
Xia, Y., Huang, E.S., Levitt, M., and Samudrala, R. 2000. Ab initio construction of protein tertiary structures using a hierarchical approach. J. Mol. Biol. 300: 171-185.
Zacharias, M. 2003. Protein-protein docking with a reduced protein model accounting for side-chain flexibility. Protein Sci. 12: 1271-1282.
Zhang, C. and Kim, S. 2000. Environment-dependent residue contact energies for proteins. Proc. Natl. Acad. Sci. 97: 2550-2555.
Zhang, Y., Kolinski, A., and Skolnick, J. 2003. TOUCHSTONE II: A new approach to ab initio structure prediction. Biophys. J. 85: 1145-1164.
Zhou, H. and Zhou, Y. 2002. Distance-scaled, finite ideal-gas reference state improves structure-derived potentials of mean force for structure selection and stability prediction. Protein Sci. 11: 2714-2726. Corrections 12: 2121 (2003).
Zhou, H. and Zhou, Y. 2004. Quantifying the effect of burial of amino acid residues on protein stability. Proteins (in press).