1 Department of Biology, 2 Center for Cancer Research, 3 Computer Science and Artificial Intelligence Laboratory, 4 Biological Engineering Division, and 5 Department of Electrical Engineering & Computer Science, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
Reprint requests to: Michael B. Yaffe, MIT Center for Cancer Research, Room E18580, Cambridge, MA 02139, USA; e-mail: myaffe{at}mit.edu; fax: (617) 452-4978; or Bruce Tidor, MIT Computer Science and Artificial Intelligence Laboratory, Room 32-212, Cambridge, MA 02139, USA; e-mail: tidor{at}mit.edu; fax: (617) 252-1816.
(RECEIVED July 1, 2004; FINAL REVISION September 7, 2004; ACCEPTED September 7, 2004)
| Abstract |
|---|
Keywords: phosphopeptide-binding domains; BRCA1; Chk1; functional site prediction
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.04964705.
| Introduction |
|---|
The structures of eight distinct classes of phosphopeptide-binding modules in complex with phosphorylated peptides or proteins have been solved (WW, PTB, SH2, MH2, FHA, WD40, Polo-box, 14-3-3). An examination of these structures reveals little structural similarity among the phosphopeptide-binding sites, apart from the evolutionary conservation seen among members of the same domain family (Yaffe and Smerdon 2001) (see Supplemental Material). We reasoned that, despite the lack of gross structural similarity, there should be some underlying chemical and physical characteristics that define the phosphopeptide-interacting surface. We therefore analyzed a representative collection of these domains in detail and evaluated a set of physical and chemical properties at discrete points along their molecular surfaces. These properties were used to calculate a propensity value for each property to occur within a phosphoresidue contact site.
We found that these propensity values correctly identified the phosphoresidue contact site on phosphopeptide-binding domains for which the site was known, in a cross-validation procedure. We used these propensities to predict the location of phosphopeptide-binding sites on the surface of two domains for which there was no published phosphopeptide cocrystal structure: the BRCT-repeat domain of the protein BRCA1, and the kinase domain of the checkpoint protein Chk1. BRCA1 is a tumor-suppressing protein whose dysfunction predisposes women to breast and ovarian cancer. The BRCT-repeat domains of BRCA1 and several other proteins were recently shown to bind phosphopeptides as part of the DNA damage response (Manke et al. 2003; Yu et al. 2003). The checkpoint kinase Chk1 plays a critical role in the cell cycle response to DNA damage, and appears to be regulated by binding to phosphopeptides at a site distinct from that of its catalytic activity (Jeong et al. 2003). The resulting predictions are corroborated by experimental data identifying the sites of phosphopeptide interaction. We anticipate that this computational approach to identifying phosphopeptide-binding sites will find general utility in the functional annotation of the structural genome, in the characterization of the structure and function of new phosphopeptide-binding domains as they are discovered, and in the identification of sites to target with inhibitors of protein/phosphopeptide interaction.
| Results |
|---|
Surface curvature
A measure of the mean local curvature about each surface point was calculated (Meyer et al. 2003) and used to produce a propensity value related to surface curvature. There is a spike in the overall distribution of surface curvatures at approximately 0.3 Å⁻¹, corresponding to the local concavity at any location where the 3 Å probe used to derive the molecular surface contacted three or more protein atoms (Fig. 1B, upper panel). There is also a small shoulder in the distribution centered at a convex curvature of 0.5 Å⁻¹, corresponding to regions where the probe touches only a single atom. The remainder of the distribution corresponds to saddle regions on the protein surface where the probe touches two atoms, and the surface has both concave and convex character.
Qualitatively, the distribution of surface points that bind to a phosphorylated side chain appears quite similar to the global distribution (Fig. 1B, middle panel). Quantitatively, however, the propensity for phosphoresidue contact, obtained by dividing the phosphoresidue contact site frequency distribution by the overall frequency distribution, is enriched in two regions (Fig. 1B, lower panel). One of these regions, with relatively high negative curvature values, arises from the ratio of sparsely populated regions of the contact site and global frequency distributions (Fig. 1B, upper and middle panels), making the predictive validity of propensities in this region questionable. The second region of high propensity lies between curvature values of 0.1 and 0.6 Å⁻¹ (Fig. 1B, lower panel), and corresponds to regions of concavity in the protein surface that are highly populated in the global distribution. The data in this region quantify the well-accepted tendency of ligands to bind to concave regions of protein surface, in the specific context of phosphopeptide-binding domain:ligand interactions.
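The two landmark magnitudes in the global curvature distribution can be checked against simple sphere geometry. As a rough sketch, assuming each patch is locally a piece of a sphere (a reentrant patch being part of the probe sphere, a contact patch part of an atomic sphere), the magnitude of the mean curvature is the reciprocal of the sphere radius. The symbols below are illustrative and do not appear in the original text.

$$
|\kappa_{\mathrm{reentrant}}| \approx \frac{1}{r_{\mathrm{probe}}} = \frac{1}{3.0\ \text{\AA}} \approx 0.33\ \text{\AA}^{-1},
\qquad
|\kappa_{\mathrm{contact}}| \approx 0.5\ \text{\AA}^{-1} \;\Rightarrow\; r_{\mathrm{atom}} \approx \frac{1}{0.5\ \text{\AA}^{-1}} = 2.0\ \text{\AA}
$$

The first value matches the spike reported where the probe rests on three or more atoms, and the second implies a radius on the order of an atomic van der Waals sphere, as expected where the probe touches a single atom.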
Electrostatic potential
To examine the effect of electrostatic potential on phosphopeptide binding, we used a continuum electrostatic model to calculate the solvated state potential of each phosphopeptide-binding domain in our data set in the absence of the cognate phosphopeptide ligand. The distribution of potentials on the phosphate-accessible surfaces of all proteins studied was bell-shaped, and centered approximately at zero (Fig. 1C, upper panel). As expected, the distribution of electrostatic potentials for the subset of the domain surfaces that contact a phosphorylated side chain is significantly shifted toward positive values (Fig. 1C, middle panel). As a result, the propensity distribution over electrostatic potentials, calculated as the distribution of electrostatic potentials in phosphoresidue contact sites divided by the global distribution of electrostatic potentials, peaks in the range between +7 and +9 kT/e.
As might be expected, the propensity for binding to phosphorylated side chains trails off as the electrostatic potential at a surface point becomes more negative from this peak, falling to almost zero at neutral electrostatic potential. Interestingly, the propensity also falls off for surface points having the highest electrostatic potential. The implication, then, is that surface points with such high positive electrostatic potentials are not as well suited for binding phosphopeptides as points with more moderate potentials, despite the high negative charge of a phosphorylated amino acid side chain. This is likely due to the high energetic cost of desolvating a region of such extreme positive potential (Lee and Tidor 1997).
Predictive ability for known phosphoresidue contact sites
To determine whether the calculated propensities were unduly influenced by any single structure in the data set, a cross-validation procedure was used ("jack-knifing") in which each structure was individually removed, and the propensities recalculated. The nine resulting sets of propensities were quite similar (shown by error bars in Fig. 1A–C, lower panels), with individual propensity values in well-populated regions of the distributions differing on average from those calculated for the full data set by less than 10% in the case of surface curvatures and electrostatic potentials, and by less than 25% for amino acid identities.
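The leave-one-out procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the toy data, and the representation of each surface point as a `(bin_label, is_contact)` pair are all assumptions made for the example.

```python
from collections import Counter


def propensity_table(points):
    """Propensity per bin: (n_b(i) / n_t(i)) / (n_b / n_t).

    points: iterable of (bin_label, is_contact) pairs (hypothetical layout).
    """
    points = list(points)
    n_t = len(points)
    n_b = sum(1 for _, is_contact in points if is_contact)
    if n_b == 0:
        raise ValueError("no contact points in the pooled data")
    total = Counter(label for label, _ in points)
    bound = Counter(label for label, is_contact in points if is_contact)
    background = n_b / n_t
    return {label: (bound[label] / total[label]) / background for label in total}


def jackknife(structures):
    """Recompute propensities with each structure left out in turn.

    structures: {structure_name: [(bin_label, is_contact), ...]}
    Returns {left_out_name: propensity_table_without_that_structure}.
    """
    tables = {}
    for left_out in structures:
        pooled = [p for name, pts in structures.items()
                  if name != left_out for p in pts]
        tables[left_out] = propensity_table(pooled)
    return tables


# Toy data set with three made-up "structures" and two bins.
toy = {
    "A": [("pos", True), ("neg", False)],
    "B": [("pos", True), ("pos", False), ("neg", False)],
    "C": [("pos", False), ("neg", True), ("neg", False)],
}
tables = jackknife(toy)
print(tables["C"])
```

Comparing each bin's value across the returned tables gives the kind of spread that the error bars summarize.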
Of the three independent propensities calculated for amino acid identity, surface curvature, and electrostatic potential, none was sufficient on its own to unambiguously identify the site of known phosphoresidue contact on the set of phosphopeptide-binding domains studied here (Fig. 2, left panels). However, the scales of propensities encountered in this analysis provide a framework for understanding the contribution of each characteristic studied to phosphoresidue binding. These scales indicate that the most favorable values of electrostatic potential are more predictive, with respect to phosphoresidue contact, than the most favorable values of amino acid identity or surface curvature. Nevertheless, unfavorable propensity values contributed by amino acid identity or surface curvature are capable of countering false-positive favorable contributions from positive electrostatic potential, improving the accuracy of our predictions, as shown in Figure 2.
, the Smad MH2 domain, and both FHA and both SH2 domains studied, only a single site of significant size and propensity was observed. However, for Pin1, Cdc4, and the Polo-box domain of Plk1, a second site of comparable size and propensity to a known real phosphopeptide-binding site was also observed (Fig. 4
Application of local surface propensity analysis to the Chk1 kinase domain surface identified two possible sites for phosphopeptide binding (Fig. 5A). These sites are connected by a small region of neutral propensity. The first site, located at the interface between the large and the small lobes of the kinase, but not in the kinase catalytic site, is made up of the amino acid side chains K54, R129, T153, R162, and N165 (Fig. 5A, rightmost indicated site). The mutations K54A, R129A, T153A, and R162A have all been shown to abrogate claspin binding in the frog Chk1 homolog Xchk1 (Jeong et al. 2003). Our results suggest that those residues are directly responsible for phosphoclaspin binding. The second site we identified, on the small lobe of the kinase domain, is adjacent to the first, and is made up of the Chk1 amino acid side chains K53, K60, H73, and R75 (Fig. 5A, leftmost indicated site). While this site has not previously been identified as a site of phosphopeptide binding, it is known that phosphoclaspin binding to Xchk1 requires two separate claspin phosphorylation events, on residues S864 and S895. It is possible, therefore, that the two phosphopeptide residues pS864 and pS895, separated by 31 amino acids, are recognized by two distinct phosphopeptide-binding sites on the Chk1 surface.
During the preparation of this manuscript, the crystal structure of the human BRCA1 BRCT domain in complex with a phosphopeptide was solved (Clapperton et al. 2004; Shiozaki et al. 2004; Williams et al. 2004). In this structure, the phosphoresidue contact site was shown to correspond to the first of the two sites on the BRCA1 surface predicted by our method, indicating that for this site at least, our prediction was correct. This result, together with the experimentally corroborated prediction on the surface of the Chk1 kinase domain, indicates that the methodology described here has captured a large portion of the chemical and physical nature of phosphopeptide binding in a manner that is useful for predicting binding sites.
The phosphoresidue contact site predictions described here were originally made by visual inspection of the joint phosphoresidue contact potential on the surfaces of Chk1 and BRCA1 and selection of the largest site of favorable propensity. We are currently exploring a vertex clustering algorithm designed to identify large regions of favorable propensity in an automated fashion.
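One simple way such automated site selection could work is to group favorable vertices into connected patches on the surface mesh and rank the patches by size. The sketch below is only an illustration of that idea under stated assumptions; it is not the algorithm the authors were exploring. The function name, the threshold of 1.0 (the propensity at which a property value is neither enriched nor depleted), and the toy mesh are all hypothetical.

```python
from collections import deque


def favorable_clusters(scores, triangles, threshold=1.0):
    """Group mesh vertices with score > threshold into connected patches.

    scores: joint propensity per vertex, indexed by vertex number.
    triangles: iterable of (a, b, c) vertex-index triples defining the mesh.
    Returns a list of sorted vertex lists, largest patch first.
    """
    # Build vertex adjacency from the triangle edges.
    neighbors = {}
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (a, c)):
            neighbors.setdefault(u, set()).add(v)
            neighbors.setdefault(v, set()).add(u)

    favorable = {i for i, s in enumerate(scores) if s > threshold}
    seen = set()
    clusters = []
    for start in sorted(favorable):
        if start in seen:
            continue
        # Breadth-first search restricted to favorable vertices.
        seen.add(start)
        queue = deque([start])
        patch = []
        while queue:
            v = queue.popleft()
            patch.append(v)
            for w in neighbors.get(v, ()):
                if w in favorable and w not in seen:
                    seen.add(w)
                    queue.append(w)
        clusters.append(sorted(patch))
    clusters.sort(key=len, reverse=True)
    return clusters


# Toy mesh: vertex 2 is unfavorable and separates two favorable patches.
scores = [2.0, 3.0, 0.5, 1.5, 4.0, 2.5]
triangles = [(0, 1, 2), (2, 3, 4), (3, 4, 5)]
clusters = favorable_clusters(scores, triangles)
print(clusters)
```

Ranking by vertex count is only a proxy for patch area; on a mesh with uniform vertex density the two are roughly proportional.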
| Discussion |
|---|
There are three important caveats to the computational method. First, we assume the independence of propensities calculated from a set of properties (amino acid identity, mean surface curvature, and electrostatic potential) that are not themselves independent. Given a large volume of data, it is possible to abandon this approximation by calculating an exact propensity value for every possible combination of property values. As more data become available, it should be possible to learn correct parameters for the combination of these propensity values.
Sites with the highest propensities for phosphoresidue contact have strong favorable propensity contributions from each of the three properties considered here. In the limit of currently available data, we find that all three properties considered here are necessary for accurate site prediction. Although strong favorable propensity for phosphoresidue contact is driven by the solvated electrostatic potential, false positive predictions that would be generated by the consideration of electrostatics alone are avoided by combining information about surface curvature and amino acid identity.
Second, we calculate and cross-validate propensity values from a set of crystal structures solved in the presence of phosphopeptide. These structures may involve some induced fit to their cognate peptides, whereas structures for which useful predictions can be made would be in their unliganded apo conformation. Despite this, we make predictions for the Chk1 kinase domain and the BRCA1 BRCT-repeat domain that are validated by experiment, indicating that the physical and chemical aspects of a phosphoresidue contact site which are captured by our model are not lost in the apo state.
Finally, the method described here is designed to identify the site of phosphoresidue contact on the surface of a known phosphopeptide-binding domain. It is clear that as novel phosphopeptide-binding domains are discovered, and as structural genomics efforts come to fruition, this approach will prove useful in rapidly identifying the functional sites on unliganded crystal structures without necessitating further crystallographic effort. Because the propensities calculated here are trained to differentiate phosphoresidue contact surface from the remainder of the surface of phosphopeptide-binding domains, this may be less useful in mining structural databases for novel phosphopeptide-binding domains. We expect, based on the emphasis given by our propensity scale to positive electrostatic potential, that this scale might score some anion- and phosphate-binding sites quite favorably. This has been confirmed by our examination of several nonphosphopeptide-binding proteins (data not shown). However, if the goal of future work is to differentiate among different types of anion-binding sites, appropriate propensity scales and other machine learning tools could certainly be developed, for example for the differentiation of phosphoresidue contact sites from such "decoy" sites.
The method described here is highly extendable, both in terms of the type of functional site examined, and in the characteristics for which propensities are calculated. Propensity calculations can be performed on continuous properties such as curvature and electrostatic potential, which have been discretized via binning, as well as on traditional discrete properties such as amino acid identity. Therefore, any property that can be assigned to the vertices of a protein surface can be applied to site predictions within this methodological framework. Moreover, predictions can be made within this framework for any functional categorization for which predictive physical surface properties can be found. Our successes in the identification of phosphoresidue contact sites on the surfaces of the Chk1 kinase domain and the BRCA1 BRCT-repeat domains indicate the utility of this methodology in functional site annotation.
| Materials and methods |
|---|
Propensity calculation
For each property associated with a surface point (amino acid identity, surface curvature, and electrostatic potential), a propensity for phosphoresidue contact was calculated. The propensity of a property i was calculated as

$$P(i) = \frac{n_b(i)/n_t(i)}{n_b/n_t}$$

where n_b(i) and n_t(i) are the number of surface points with characteristic i contacting phosphoresidues and in total, respectively, and n_b and n_t are the number of surface points contacting phosphoresidues and total number of surface points in the data set, regardless of characteristic.
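The definition above translates directly into a counting exercise. The following is a minimal sketch using made-up data; the function name and the parallel-list input layout are assumptions for illustration, and the counts follow the n_b(i), n_t(i), n_b, and n_t defined in the text.

```python
from collections import Counter


def propensities(labels, in_contact):
    """Propensity of each property value i: (n_b(i) / n_t(i)) / (n_b / n_t).

    labels: property value (e.g. residue name or bin index) per surface point.
    in_contact: parallel list of booleans, True if the point contacts
                a phosphoresidue.
    """
    if len(labels) != len(in_contact):
        raise ValueError("labels and in_contact must have the same length")
    n_t = len(labels)                      # all surface points
    n_b = sum(1 for c in in_contact if c)  # all contact points
    if n_b == 0:
        raise ValueError("data set contains no contact points")
    n_t_i = Counter(labels)
    n_b_i = Counter(lbl for lbl, c in zip(labels, in_contact) if c)
    background = n_b / n_t
    return {i: (n_b_i[i] / n_t_i[i]) / background for i in n_t_i}


# Toy example: eight surface points, three of which are contact points.
labels = ["Arg", "Arg", "Leu", "Leu", "Leu", "Lys", "Asp", "Asp"]
in_contact = [True, True, False, False, False, True, False, False]
table = propensities(labels, in_contact)
print(table)
```

A value above 1 means the property value is enriched in contact sites relative to the surface as a whole, and a value below 1 means it is depleted.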
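When attempting to predict the phosphoresidue contact site on a protein, the propensity assigned to each surface point was computed, under the simplifying assumption that propensities generated using amino acid identity, local mean surface curvature, and solvated electrostatic potential combine noncooperatively, as

$$P_{\mathrm{joint}} = P_{\mathrm{aa}} \times P_{\mathrm{curv}} \times P_{\mathrm{elec}}$$

where P_aa, P_curv, and P_elec are the propensities associated with the amino acid identity, mean surface curvature, and electrostatic potential at that point. Figure 2 shows one example of the combination of these three individual propensities to derive a joint propensity.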
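A per-point joint score can be sketched as follows, assuming that "combine noncooperatively" means the three individual propensities are multiplied, which is the natural consequence of treating the properties as independent. The lookup-table layout, the function name, and the neutral default of 1.0 for property values never seen in training are assumptions for illustration only.

```python
def joint_scores(points, aa_table, curv_table, elec_table, default=1.0):
    """Multiply the three per-property propensities for every surface point.

    points: list of (residue_name, curvature_bin, potential_bin) tuples.
    *_table: {property_value: propensity} lookup tables.
    default: propensity used for values missing from a table
             (assumed neutral; not specified in the source).
    """
    scores = []
    for residue, curv_bin, elec_bin in points:
        p_aa = aa_table.get(residue, default)
        p_curv = curv_table.get(curv_bin, default)
        p_elec = elec_table.get(elec_bin, default)
        scores.append(p_aa * p_curv * p_elec)
    return scores


# Toy lookup tables with invented values.
aa_table = {"Arg": 2.0, "Leu": 0.4}
curv_table = {3: 0.5, 4: 1.2}
elec_table = {44: 3.0, 30: 0.1}

points = [("Arg", 3, 44), ("Leu", 4, 30), ("Gly", 4, 44)]
scores = joint_scores(points, aa_table, curv_table, elec_table)
print(scores)
```

Because the terms multiply, a single strongly unfavorable property can cancel a favorable electrostatic contribution, which is the behavior the Results describe for suppressing false positives.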
Surface and contact calculation
The program MSMS (Sanner et al. 1996) was used to obtain a triangular surface mesh for each phosphopeptide-binding domain, using a probe radius of 3.0 Å, the approximate radius of a phosphate ion, and a surface density of 5.0 vertices/Å². Calculations were performed on a monomer of each phosphopeptide-binding domain in the presence and the absence of only the phosphorylated side chain of the corresponding binding peptide. Surface points contacted by the phosphoresidue were identified as those that were surface accessible on the unliganded protein surface but buried in the protein/phosphoresidue complex surface such that they were farther than 0.3 Å from the nearest point on the bound-state surface.
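The contact criterion can be sketched as a nearest-neighbor distance test between the two vertex sets. This is a minimal illustration that assumes the surface vertices have already been read into lists of (x, y, z) coordinates in Å; the function name is hypothetical, and no MSMS file parsing is shown.

```python
import math


def contact_point_indices(free_surface, bound_surface, cutoff=0.3):
    """Indices of unliganded-surface points buried on ligand binding.

    A point counts as contacted if it lies farther than `cutoff` (in Å)
    from every vertex of the bound-state surface.
    """
    if not bound_surface:
        raise ValueError("bound_surface must contain at least one vertex")
    hits = []
    for index, point in enumerate(free_surface):
        nearest = min(math.dist(point, other) for other in bound_surface)
        if nearest > cutoff:
            hits.append(index)
    return hits


# Toy coordinates: only the middle point has no bound-state vertex nearby.
free_surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
bound_surface = [(0.0, 0.0, 0.1), (2.0, 0.0, 0.0)]
contacts = contact_point_indices(free_surface, bound_surface)
print(contacts)
```

The brute-force search is quadratic in the number of vertices; for full-size meshes a spatial index such as a k-d tree would be the usual replacement.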
Amino acid identity assignment
The amino acid identity of each surface point was recorded as identified by MSMS, with points on the reentrant phosphate-accessible molecular surface assigned to the nearest atomic van der Waals sphere.
Mean surface curvature assignment
The mean surface curvature at each point was calculated according to the method of Meyer et al. (2003). In order to discretize the space of curvatures for propensity calculation, surface curvatures were binned with a bin width of 0.1 Å⁻¹ between the values of −0.6 and 1.4 Å⁻¹, with curvatures above and below the extrema placed in the highest and lowest bin, respectively. Calculated propensities were found to be insensitive to the bin size selected over a range of bin sizes from 0.05 Å⁻¹ to 0.5 Å⁻¹.
Solvated electrostatic potential assignment
The electrostatic potential at each surface point was calculated with a continuum electrostatic model with a locally modified version of the program DELPHI (Gilson et al. 1988; Sharp and Honig 1990a,b). The calculation used the phosphopeptide-binding domain alone, a solvent dielectric of 80, a salt concentration of 0.145 M, a protein dielectric of 4, and PARSE parameters (Sitkoff et al. 1994). Prior to calculating potentials, hydrogen atom positions and the titration and flip states of histidine, glutamine, and asparagine side chains were assigned to the protein structures using the program REDUCE (Word et al. 1999). Electrostatic potentials were discretized for propensity calculation by binning, with a bin width of 0.5 kT/e, with data below −15 kT/e or above +15 kT/e assigned to the lowest and the highest bin, respectively. Calculated propensities were found to be insensitive to the bin size selected over a range of bin sizes from 0.25 to 5.0 kT/e.
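The discretization step, with out-of-range values folded into the end bins, can be sketched with a small helper. The function name and signature are assumptions for illustration; the range and bin width in the example are the electrostatic values given in the text, and the same helper would apply to curvature with its own range and width.

```python
import math


def bin_index(value, low, high, width):
    """Map a continuous value to a bin index, clamping to the end bins.

    Bins cover [low, high) in steps of `width`; values below `low` go to
    bin 0 and values at or above `high` go to the last bin.
    """
    n_bins = round((high - low) / width)
    index = math.floor((value - low) / width)
    return max(0, min(n_bins - 1, index))


# Electrostatic potential in kT/e: 60 bins of width 0.5 from -15 to +15.
examples = [7.3, 0.1, 20.0, -100.0]
indices = [bin_index(v, -15.0, 15.0, 0.5) for v in examples]
print(indices)
```

Values that fall exactly on a bin edge are sensitive to floating-point rounding, so a production version might work in integer multiples of the bin width instead.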
| References |
|---|
Clapperton, J.A., Manke, I.A., Lowery, D.M., Ho, T., Haire, L.F., Yaffe, M.B., and Smerdon, S.J. 2004. Structure and mechanism of BRCA1 BRCT domain recognition of phosphorylated BACH1 with implications for cancer. Nat. Struct. Mol. Biol. 11: 512–518.
Durocher, D., Taylor, I.A., Sarbassova, D., Haire, L.F., Westcott, S.L., Jackson, S.P., Smerdon, S.J., and Yaffe, M.B. 2000. The molecular basis of FHA domain: Phosphopeptide binding specificity and implications for phospho-dependent signaling mechanisms. Mol. Cell 6: 1169–1182.
Eck, M.J., Shoelson, S.E., and Harrison, S.C. 1993. Recognition of a high-affinity phosphotyrosyl peptide by the Src homology-2 domain of P56(Lck). Nature 362: 87–91.
Elia, A.E.H., Rellos, P., Haire, L.F., Chao, J.W., Ivins, F.J., Hoepker, K., Mohammad, D., Cantley, L.C., Smerdon, S.J., and Yaffe, M.B. 2003. The molecular basis for phosphodependent substrate targeting and regulation of Plks by the Polo-box domain. Cell 115: 83–95.
Gilson, M.K., Sharp, K.A., and Honig, B. 1988. Calculating the electrostatic potential of molecules in solution: Method and error assessment. J. Comp. Chem. 9: 327–335.
Jeong, S.Y., Kumagai, A., Lee, J., and Dunphy, W.G. 2003. Phosphorylated claspin interacts with a phosphate-binding site in the kinase domain of Chk1 during ATR-mediated activation. J. Biol. Chem. 278: 46782–46788.
Jones, S. and Thornton, J.M. 1997. Prediction of protein-protein interaction sites using patch analysis. J. Mol. Biol. 272: 133–143.
Jones, S., Shanahan, H.P., Berman, H.M., and Thornton, J.M. 2003. Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins. Nucleic Acids Res. 31: 7189–7198.
Joo, W.S., Jeffrey, P.D., Cantor, S.B., Finnin, M.S., Livingston, D.M., and Pavletich, N.P. 2002. Structure of the 53BP1 BRCT region bound to p53 and its comparison to the Brca1 BRCT structure. Genes & Dev. 16: 583–593.
Kraulis, P.J. 1991. Molscript: A program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24: 946–950.
Lee, L.P. and Tidor, B. 1997. Optimization of electrostatic binding free energy. J. Chem. Phys. 106: 8681–8690.
Li, J.J., Williams, B.L., Haire, L.F., Goldberg, M., Walker, E., Durocher, D., Yaffe, M.B., Jackson, S.P., and Smerdon, S.J. 2002. Structural and functional versatility of the FHA domain in DNA-damage signaling by the tumor suppressor kinase Chk2. Mol. Cell 9: 1045–1054.
Lichtarge, O., Yao, H., Kristensen, D.M., Madabushi, S., and Mihalek, I. 2003. Accurate and scalable identification of functional sites by evolutionary tracing. J. Struct. Funct. Genomics 4: 159–166.
Manke, I.A., Lowery, D.M., Nguyen, A., and Yaffe, M.B. 2003. BRCT repeats as phosphopeptide-binding modules involved in protein targeting. Science 302: 636–639.
Merritt, E.A. and Bacon, D.J. 1997. Raster3D: Photorealistic molecular graphics. Methods Enzymol. 277: 505–524.
Meyer, M., Desbrun, M., Schröder, P., and Barr, A.H. 2003. Discrete differential-geometry operators for triangulated 2-manifolds. In Visualization and mathematics III, pp. 34–58. Springer Verlag, Heidelberg, Germany.
Orlicky, S., Tang, X.J., Willems, A., Tyers, M., and Sicheri, F. 2003. Structural basis for phosphodependent substrate selection and orientation by the SCFCdc4 ubiquitin ligase. Cell 112: 243–256.
Rittinger, K., Budman, J., Xu, J.A., Volinia, S., Cantley, L.C., Smerdon, S.J., Gamblin, S.J., and Yaffe, M.B. 1999. Structural analysis of 14-3-3 phosphopeptide complexes identifies a dual role for the nuclear export signal of 14-3-3 in ligand binding. Mol. Cell 4: 153–166.
Sanner, M.F., Olson, A.J., and Spehner, J.C. 1996. Reduced surface: An efficient way to compute molecular surfaces. Biopolymers 38: 305–320.
Sharp, K.A. and Honig, B. 1990a. Calculating total electrostatic energies with the nonlinear Poisson-Boltzmann Equation. J. Phys. Chem. 94: 7684–7692.
Sharp, K.A. and Honig, B. 1990b. Electrostatic interactions in macromolecules: Theory and applications. Annu. Rev. Biophys. Biophys. Chem. 19: 301–332.
Shiozaki, E.N., Gu, L., Yan, N., and Shi, Y. 2004. Structure of the BRCT repeats of BRCA1 bound to a BACH1 phosphopeptide: Implications for signaling. Mol. Cell 14: 405–412.
Sitkoff, D., Sharp, K.A., and Honig, B. 1994. Accurate calculation of hydration free energies using macroscopic solvent models. J. Phys. Chem. 98: 1978–1988.
Verdecia, M.A., Bowman, M.E., Lu, K.P., Hunter, T., and Noel, J.P. 2000. Structural basis for phosphoserine-proline recognition by group IV WW domains. Nat. Struct. Biol. 7: 639–643.
Waksman, G., Shoelson, S.E., Pant, N., Cowburn, D., and Kuriyan, J. 1993. Binding of a high-affinity phosphotyrosyl peptide to the Src SH2 domain: Crystal structures of the complexed and peptide-free forms. Cell 72: 779–790.
Williams, R.S., Lee, M.S., Hau, D.D., and Glover, J.N. 2004. Structural basis of phosphopeptide recognition by the BRCT domain of BRCA1. Nat. Struct. Mol. Biol. 11: 519–525.
Word, J.M., Lovell, S.C., Richardson, J.S., and Richardson, D.C. 1999. Asparagine and glutamine: Using hydrogen atom contacts in the choice of side-chain amide orientation. J. Mol. Biol. 285: 1735–1747.
Wu, J.W., Hu, M., Chai, J.J., Seoane, J., Huse, M., Li, C., Rigotti, D.J., Kyin, S., Muir, T.W., Fairman, R., et al. 2001. Crystal structure of a phosphorylated Smad2: Recognition of phosphoserine by the MH2 domain and insights on Smad function in TGF-β signaling. Mol. Cell 8: 1277–1289.
Yaffe, M.B. 2002. Phosphotyrosine-binding domains in signal transduction. Nat. Rev. Mol. Cell Biol. 3: 177–186.
Yaffe, M.B. and Elia, A.E. 2001. Phosphoserine/threonine-binding domains. Curr. Opin. Cell Biol. 13: 131–138.
Yaffe, M.B. and Smerdon, S.J. 2001. Phosphoserine/threonine binding domains: You can't pSERious? Structure 9: R33–R38.
Yaffe, M.B., Schutkowski, M., Shen, M.H., Zhou, X.Z., Stukenberg, P.T., Rahfeld, J.U., Xu, J., Kuang, J., Kirschner, M.W., Fischer, G., et al. 1997. Sequence-specific and phosphorylation-dependent proline isomerization: A potential mitotic regulatory mechanism. Science 278: 1957–1960.
Yu, X.C., Chini, C.C.S., He, M., Mer, G., and Chen, J.J. 2003. The BRCT domain is a phospho-protein binding domain. Science 302: 639–642.
Zhou, M.M. 2000. Phosphothreonine recognition comes into focus. Nat. Struct. Biol. 7: 1085–1087.