Department of Chemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
Reprint requests to: Juliette T.J. Lecomte, Department of Chemistry, The Pennsylvania State University, 152 Davey Laboratory, University Park, PA 16802, USA; e-mail: jtl1{at}psu.edu; fax: (814) 863-8403.
(RECEIVED May 15, 2003; FINAL REVISION July 11, 2003; ACCEPTED July 11, 2003)
1 Present addresses: Department of Microbiology and Molecular Genetics, Markey Center for Molecular Genetics, University of Vermont, Burlington, VT 05405, USA;
2 Xencor, Inc., 111 West Lemon Avenue, Monrovia, CA 91016, USA.
Supplemental material: See www.proteinscience.org
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.03190903.
| Abstract |
|---|
Keywords: Protein structure; protein folding; stability and mutagenesis; protein design; WW domain
Abbreviations: BMRB, BioMagResBank; bp, base pair; CD, circular dichroism; DQF-COSY, double-quantum-filtered correlated spectroscopy; DSS, 2,2-dimethyl-2-silapentane-5-sulfonate; EDTA, ethylenediaminetetraacetic acid; GA, genetic algorithm; MALDI, matrix-assisted laser desorption ionization; MOPS, 3-[N-morpholino]propanesulfonic acid; MRE, molar residual ellipticity; NIa, nuclear inclusion a; NOE, nuclear Overhauser effect; NOESY, two-dimensional nuclear Overhauser effect spectroscopy; PAGE, polyacrylamide gel electrophoresis; PCR, polymerase chain reaction; ppm, parts per million; r.m.s.d., root-mean-square deviation; SDS, sodium dodecylsulfate; SPA, sequence prediction algorithm; SPANS, sequence prediction algorithm for numerous states; TFA, trifluoroacetic acid; TOCSY, totally correlated two-dimensional spectroscopy; TPPI, time-proportional phase incrementation; Tris, tris(hydroxymethyl)aminomethane.
| Introduction |
|---|
α-helical, structures that exhibit a prevalence of short-range interactions. In contrast, β-sheet proteins depend on long-range H-bonding interactions that are a challenge for computational design (Ottesen and Imperiali 2001; Karanicolas and Brooks III 2003). Early efforts recognized the importance of stabilizing the topology, and resorted to metal ions and disulfide bridges to encode the specificity of the fold (Pessi et al. 1993; Quinn et al. 1994; Yan and Erickson 1994). Work toward the de novo design of simple β-hairpin peptides (de Alba et al. 1996; Ramirez-Alvarado et al. 1996) and the iterative development of a small β-sheet protein (Kortemme et al. 1998; Lopez de la Paz et al. 2001) later met with mixed results. We report here the automated design of a β-sheet protein devoid of ligands and cross-links through the implementation of a sequence prediction algorithm, or SPA (Raha et al. 2000), coupled with a novel sampling procedure that integrates information from an ensemble of backbone structures. This algorithm can be used to select one or more sequences that have a high preference for a specified fold. We have chosen to design sequences for a naturally occurring structural motif, the WW domain, using a backbone structure derived from the Pin1 protein. The use of a detailed backbone structure from an existing protein has many advantages. First, the existence of the fold in nature guarantees that design success is indeed possible. Second, the sequence and thermodynamic characteristics of the wild-type sequence provide useful reference points for the extent of success of the design procedure. Finally, when the target motif is a member of a large natural family, statistical information derived from the family members can be used to discover weaknesses in the design process.
The WW domain is a small natural module that is in many cases capable of folding autonomously. It is thought to be primarily responsible for binding to proline-containing regions of partner proteins (Otte et al. 2003). The domain comprises only 34–40 amino acids and adopts a three-stranded antiparallel β-sheet structure (Macias et al. 1996; Koepf et al. 1999b; Deechongkit and Kelly 2002). We have used the backbone structure of the WW domain from human peptidyl-prolyl cis–trans isomerase (hPin1; Ranganathan et al. 1997) as our framework. The advantages of using the WW domain from hPin1 are its two-state folding kinetics and its ability to fold into the desired structure after single-site mutagenesis at most positions (Jäger et al. 2001). The primary structure and fold of the hPin1 WW domain are given in Table 1 and Figure 1, respectively. We report the characterization of two polypeptides resulting from the implementation of our algorithm on the WW backbone scaffold. Thermodynamic measurements and structural inspection demonstrated that at least one of these two candidate sequences preferentially adopted the target WW fold.
| Results and Discussion |
|---|
The backbone ensemble used in the design process was derived from the WW domain backbone coordinates contained in PDB entry 1PIN. To begin the creation of each new backbone, a Monte Carlo expansion and contraction algorithm was applied by which initial perturbation by ±5° of φ and ψ backbone angles at all positions in the structure was followed by refinement of the angles until the backbone reached a preset maximal root-mean-square deviation (r.m.s.d.) value relative to the reference backbone. The refinement procedure involved a perturbation of a single, randomly chosen backbone angle by ±1°. If the r.m.s.d. decreased, the move was accepted; if the r.m.s.d. increased, the move was rejected. For the ensemble used herein, the maximal r.m.s.d. value was set at 0.30 Å; when the total r.m.s.d. reached this cutoff, the refinement procedure was stopped for that backbone model and the procedure was begun again with the wild-type backbone to obtain a different perturbed backbone model. Thirty nondegenerate backbones, all within 0.30 Å r.m.s.d. of the initial experimentally determined backbone structure, were thus created. Figure 2A depicts a typical backbone ensemble derived from hPin1 and obtained by this procedure.
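The expansion and contraction procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the reference angles are invented, and the Cartesian r.m.s.d. (in Å) is replaced by a toy angular deviation (in degrees) so that no coordinate rebuilding is needed. The cutoff of 3.0° is therefore only a stand-in for the 0.30 Å value.

```python
import math
import random


def angular_rmsd(angles, reference):
    """Toy deviation measure: RMS difference of torsion angles, in degrees.

    Assumption: this stands in for the Cartesian r.m.s.d. (in angstroms) used
    in the text, which would require rebuilding atomic coordinates.
    """
    total = sum((a - b) ** 2 for a, b in zip(angles, reference))
    return math.sqrt(total / len(reference))


def perturbed_backbone(reference, cutoff, rng, expand=5.0, step=1.0):
    """Expand every angle by +/-expand, then contract until deviation <= cutoff."""
    angles = [a + rng.choice((-expand, expand)) for a in reference]
    current = angular_rmsd(angles, reference)
    while current > cutoff:
        i = rng.randrange(len(angles))
        trial = list(angles)
        trial[i] += rng.choice((-step, step))
        score = angular_rmsd(trial, reference)
        if score < current:  # accept only moves that reduce the deviation
            angles, current = trial, score
    return angles


def build_ensemble(reference, n_models=30, cutoff=3.0, seed=0):
    """Restart from the reference for each model and keep only distinct models."""
    rng = random.Random(seed)
    ensemble = []
    while len(ensemble) < n_models:
        model = perturbed_backbone(reference, cutoff, rng)
        if model not in ensemble:
            ensemble.append(model)
    return ensemble


# Hypothetical reference: 34 residues with phi = -120 and psi = 130 everywhere.
reference = [-120.0, 130.0] * 34
ensemble = build_ensemble(reference)
```

Because each model restarts from the reference and follows a different random path, the resulting structures differ from one another while all lying within the same deviation cutoff.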
A set of partition functions {Qx,r,i} for all amino acid rotamers at all positions in the protein defined a probabilistic fitness matrix:
![]() | (1) |
where x is the amino acid type, s is a sub-rotamer state of rotamer r of amino acid x, i is the position in the structure, m is the model under evaluation, and N is the total number of models used. Ex,r,s,i,m is the total calculated energy, according to the SPA scoring function, of the specific model under evaluation given the current sub-rotamer of amino acid type x at position i. The partition function for each amino acid and rotamer combination was continually updated as more backbone structures and models under evaluation were added to the simulation. Because each model contained a unique configuration of backbone structure, side chain identities, and rotamers, each rotamer state was exposed to a wide range of environments.
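The bookkeeping described here can be illustrated with a short sketch. It assumes a Boltzmann-weighted sum over sub-rotamer states averaged over the models, which is consistent with the definitions of x, r, s, i, m, and N above but is not a transcription of equation 1. The class name, the thermal factor `kt`, and the example energies are all hypothetical.

```python
import math
from collections import defaultdict


class FitnessMatrix:
    """Running partition functions keyed by (amino_acid, rotamer, position)."""

    def __init__(self, kt=0.593):
        # kt in kcal/mol near room temperature; the value is an assumption.
        self.kt = kt
        self.n_models = 0
        self._sums = defaultdict(float)

    def add_model(self, energies):
        """energies maps (aa, rotamer, position) -> list of sub-rotamer energies."""
        for key, sub_energies in energies.items():
            self._sums[key] += sum(math.exp(-e / self.kt) for e in sub_energies)
        self.n_models += 1

    def q(self, key):
        """Partition function for one rotamer state, averaged over models."""
        return self._sums[key] / self.n_models

    def probabilities(self, position):
        """Normalize the Q values at one position into selection probabilities."""
        qs = {k: self.q(k) for k in self._sums if k[2] == position}
        total = sum(qs.values())
        return {k: v / total for k, v in qs.items()}


# Two hypothetical models evaluated at position 6.
matrix = FitnessMatrix()
matrix.add_model({("TRP", 0, 6): [-2.0, -1.5], ("ALA", 0, 6): [0.5]})
matrix.add_model({("TRP", 0, 6): [-1.0], ("ALA", 0, 6): [0.0, 0.2]})
probs = matrix.probabilities(6)
```

Updating the sums incrementally mirrors the statement that the partition functions were continually updated as more models were added to the simulation.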
To ensure self-consistency in the final probability matrix and free energy values, three cycles of design and sampling were performed over the backbone ensemble. The Qx,r,i values from each cycle, representing the cumulative probability of each rotamer state, were used to guide the next cycle of design simulations by serving as a probabilistic selection matrix for amino acids and rotamers. Figure 2B illustrates the probabilities of all allowed amino acid rotamer states as calculated using the SPANS method on the hPin1 WW backbone. The single sequences designed herein were based directly on the amino acid with the highest calculated probability at each position.
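Selecting the most probable amino acid at each position is easy to express in code. The sketch below assumes that rotamer probabilities are summed per amino acid before the maximum is taken; the text does not say whether this marginalization was used, and the function name and toy probabilities are hypothetical.

```python
from collections import defaultdict


def consensus_sequence(prob_matrix):
    """prob_matrix maps position -> {(amino_acid, rotamer): probability}."""
    sequence = []
    for position in sorted(prob_matrix):
        per_residue = defaultdict(float)
        for (amino_acid, _rotamer), p in prob_matrix[position].items():
            per_residue[amino_acid] += p  # sum over rotamers (an assumption)
        sequence.append(max(per_residue, key=per_residue.get))
    return "".join(sequence)


# Hypothetical probabilities at two positions (one-letter amino acid codes).
toy_matrix = {
    6: {("W", 0): 0.7, ("F", 0): 0.3},
    21: {("N", 0): 0.35, ("N", 1): 0.2, ("D", 0): 0.25, ("D", 1): 0.2},
}
best = consensus_sequence(toy_matrix)
```

In this toy case Trp wins at position 6, and Asn narrowly beats Asp at position 21.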
The results from the SPANS simulations are matrices detailing mean-field free energies for every residue and rotamer combination. Although the Qx,r,i values represent probabilities useful for designing individual sequences, such matrices can also readily be used to generate libraries of protein sequences consistent with the fold.
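As a rough illustration of how such a matrix could seed a sequence library, the sketch below converts a partition function into a free energy and draws sequences position by position. The relation F = −kT ln Q, the independent sampling of positions, and all names and numbers are assumptions made for the example rather than details taken from the study.

```python
import math
import random


def mean_field_free_energy(q, kt=0.593):
    """Free energy from a partition function, assuming F = -kT ln Q."""
    return -kt * math.log(q)


def sample_library(residue_probs, n_sequences, seed=0):
    """residue_probs maps position -> {amino_acid: probability}."""
    rng = random.Random(seed)
    positions = sorted(residue_probs)
    library = []
    for _ in range(n_sequences):
        letters = []
        for p in positions:
            choices = list(residue_probs[p])
            weights = list(residue_probs[p].values())
            letters.append(rng.choices(choices, weights=weights)[0])
        library.append("".join(letters))
    return library


# Hypothetical two-position matrix: position 1 is fixed, position 2 varies.
library = sample_library({1: {"W": 1.0}, 2: {"N": 0.6, "D": 0.4}}, n_sequences=50)
```

Positions with a sharply peaked distribution stay constant across the library, while ambiguous positions are diversified in proportion to their probabilities.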
Designed WW domains
The application of SPANS to backbone ensembles derived from the hPin1 WW domain led to two amino acid sequences, SPANSWW1 and SPANSWW2 (Table 1). Theoretical pIs were calculated to be 6.3 (SPANSWW1) and 4.8 (SPANSWW2), compared to the hPin1 WW value of 10.4. As is typical for a well-parameterized protein design algorithm (Kuhlman and Baker 2000; Raha et al. 2000), the designed sequences have notable similarities to natural WW domains and similar secondary structure propensities as calculated by PsiPred (Jones 1999). The levels of identity, however, are at most 35% with the hPin1 sequence.
Homology between the designed proteins and the WW family occurred primarily in and around the first hydrophobic cluster (Fig. 1B). Two of the three conserved prolines (3 and 32) were consistently selected. This was most likely due to the key role of prolines in forming a hydrophobic core without burying an unprotected backbone nitrogen atom and in fixing the backbone at those positions. In addition, the common WW pattern of three consecutive aromatics was preserved in the designed proteins (YYY versus YYF). Also conserved in the designed sequences is the Asn at position 21, which makes several hydrogen bonding interactions with surrounding backbone atoms and is important for stability of the domain (Jäger et al. 2001). Interestingly, the free energy matrices rank Asp as the next most probable residue type at this position, consistent with its prevalence in natural WW domains. The preservation of most of the highly conserved WW domain residues in the designed sequences suggests that the subtle balance among factors that stabilize or destabilize folding is reasonably well approximated by the scoring functions used in SPA. Although each designed protein maintained the structural Trp participating in the hydrophobic core (Trp6), neither sequence contained the functional Trp (Trp29). In the wild-type protein, Trp29 is significantly solvent-exposed (Fig. 1A) and is involved directly in binding to the peptide substrate. Because substrate binding was not a constraint of the simulations, there was no particular driving force for the selection of a Trp at position 29; interestingly, the placement of a Trp at this location in the simulations yielded a low theoretical calculated energy within the WW matrix. The absence of Trp29 from the designed sequences has implications for the interpretation of the spectroscopic data, as explained later.
The WW domain from human Pin1
The hPin1 WW and, to a lesser extent, the WW domain from Yes-associated protein (hYap), exhibit a strong positive peak in their circular dichroism spectra (Koepf et al. 1999b; Macias et al. 2000; Deechongkit and Kelly 2002; Nguyen et al. 2003). For the hPin1 WW domain, the maximum occurs at 226 nm, as shown in Figure 3. This distinctive feature is useful as a qualitative signature of the WW fold. However, mutation of the functional Trp to Ala (W29A) in the hPin1 sequence yielded a significant decrease in the intensity of this positive band under conditions strongly favoring the folded (N) state. Thus, the main portion of the 226-nm signal in the native protein was attributed to the presence of Trp29, and variations in the spectrum were expected depending on the residue occupying position 29. The WW domain from hPin1 exhibited a cooperative thermal denaturation when studied by circular dichroism (Fig. 4). The midpoint of the transition (Tm) was approximately 55°C. The W29A replacement resulted in a decrease in Tm of ~4°C.
Pro (K8P) was introduced in the wild-type hPin1 protein. The CD data collected on the mutant revealed the 230-nm feature typical of wild-type hPin1 WW domain (shown in Supplemental Material). The K8P replacement led to a destabilization comparable to that of the W29A replacement mentioned earlier, causing a drop in Tm of ~2°C. These results suggested that the presence of Pro8 could be tolerated in a WW-type fold and that its choice by SPANS in the designed protein was reasonable. Further biophysical characterization of SPANSWW1 was not attempted because of the low-intensity 230-nm signal, the shallow thermal denaturation, and the poor solubility, all of which supported that the target fold, although likely sampled, was not highly populated.
SPANSWW2
SPANSWW2 (Fig. 6) exhibited a maximum in ellipticity at ~230 nm, similar in intensity to the W29A hPin1 mutant signal but shifted to the red by 5 nm (Fig. 3). The thermal denaturation monitored by CD indicated a reversible and moderately sigmoidal transition with a Tm near 20°C (Fig. 5B), a marked improvement over SPANSWW1. The curve paralleled that of the target WW domain (Fig. 4), and further structural characterization was therefore attempted by NMR spectroscopy.
H-CβH2 moieties of these residues were identified through NOEs to the ring C
Hs. In a hint of a hydrophobic cluster stabilized by long-range interactions, dipolar contacts were observed between one of the tyrosine rings and Trp6. Sequential effects in the 18–20 stretch identified this residue as the central tyrosine of the YYY triad. From these ring assignments and the model of SPANSWW2 (Fig. 6
H-CβH2 spin system. NOEs between this side chain and Trp6, but not Tyr19 or the shifted proline, were consistent with assignment to Asn21. This was confirmed through sequential effects to Tyr20. One methyl group of a valine residue was found in contact with the shifted β proton of Asn21; this valine was the first of a sequential VV pair and was therefore identified as Val26. Additional assignments followed by a nearest neighbor approach (a list is provided in the Electronic Supplemental Material). Two networks of contacts were identified: one centered around Trp6, including Tyr19, Asn21, Val26, Thr28, and Pro32, and one centered around Tyr18, including Leu9, Lys11, Val27, and Tyr20. It is important to note that the NOEs and chemical shifts were consistent with a specific and long-lived arrangement of the side chains, indicating the preponderance of certain rotameric states. For several resolved core residues, and as far as could be determined at this level of analysis, the preferred state appeared to agree with that targeted by the algorithm. The formation of distinct hydrophobic clusters in SPANSWW2 is in contrast with the molten-globular species often obtained for designed proteins when side chain packing is not optimal.
The secondary structure of folded SPANSWW2 could also be characterized with the NMR data. More than a third of the Cα protons were found downfield from 4.7 ppm, reflecting the presence of β secondary structure. Accordingly, CαHi-NHi+1 NOEs were readily followed in three extended strands, encompassing residues 6 to 12, 18 to 22, and 26 to 29. A portion of the NOESY fingerprint region is shown in Figure 8 to illustrate the connectivities involving several of these protons. Furthermore, CαH-CαH NOEs, such as those shown in Figure 9, determined the register of the three strands and confirmed the formation of the WW sheet. The strongest observed dipolar connectivities are schematized in Figure 10. NHi-NHi+1 NOEs were detected from residue 23 to residue 26, in the turn between the second and third strands. The conformational bias for the intended structure is further supported by a remarkable correspondence of CαH chemical shifts for similar residues in SPANSWW2 and the apo hPin1 WW domain (Kowalski et al. 2002). Examples include Trp6 (5.21 ppm in SPANSWW2 and 5.27 ppm in apo hPin1 WW domain); Tyr19 (5.51 ppm and 5.30 ppm); and Tyr/Phe20 (6.08 ppm and 5.69 ppm). For each of these CαH protons, the random coil value is ~4.42 ppm. Figure 11 summarizes the differences between observed and random coil CαH shifts in both peptides.
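The comparison with random coil values amounts to a simple subtraction, worked through below for the three residues quoted above. The shifts and the ~4.42 ppm random coil value are taken from the text; the variable and function names are illustrative only.

```python
RANDOM_COIL_PPM = 4.42  # approximate random coil value quoted in the text

# (SPANSWW2 shift, apo hPin1 WW shift) in ppm, as quoted in the text.
observed = {
    "Trp6": (5.21, 5.27),
    "Tyr19": (5.51, 5.30),
    "Tyr/Phe20": (6.08, 5.69),
}


def secondary_shift(shift_ppm, random_coil_ppm=RANDOM_COIL_PPM):
    """Observed minus random coil shift, rounded to 0.01 ppm."""
    return round(shift_ppm - random_coil_ppm, 2)


secondary = {
    residue: (secondary_shift(designed), secondary_shift(wild_type))
    for residue, (designed, wild_type) in observed.items()
}

for residue, (designed, wild_type) in secondary.items():
    print(f"{residue}: SPANSWW2 {designed:+.2f} ppm, hPin1 WW {wild_type:+.2f} ppm")
```

All six differences are positive and larger than 0.7 ppm, in line with the downfield shifts expected for residues in extended strands.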
Ser weakens the hydrogen bond between the backbone carbonyl oxygen of residue 4 and the side chain of Asn21. Although Pro32 was found in the proximity of Trp6, the ring current shifts did not match the predicted values for all protons. These and other peculiarities of the sequence may be responsible for differences in the terminal region. The calculation of a high-resolution structure with proton data was not feasible, in part because of the persistence of slow conformational exchange, and in part because of overlap from residues in the loops; however, the sequential and interresidue NOEs and the shifts demonstrated that the majority of the polypeptide molecules adopted on average a secondary and tertiary structure similar to those of the WW target.
In evaluating the SPANSWW2 results, it is worth pointing out that not all naturally occurring WW domains are stable. A recent survey reports that at 12°C, only 30 of 42 naturally occurring WW domains were fully folded in the absence of ligand (Otte et al. 2003). Slow conformational exchange on the chemical shift time scale was observed for the Trp side chains of some unstable WW domain peptides (Ferguson et al. 2001; Otte et al. 2003), as was detected here for Trp6 in SPANSWW2 in TOCSY and NOESY data. Thus, the algorithm identified a sequence coding for both realistic thermodynamic stability and activation energy barrier to conformational changes disrupting the folded state. Some natural WW domains are known for the complexity of their folding transition (Koepf et al. 1999b; Crane et al. 2000; Karanicolas and Brooks III 2003; Kuznetsov et al. 2003). The sigmoidal unfolding profile observed for SPANSWW2 (Fig. 5B) is suggestive of a cooperative process. However, additional characterization is necessary for a more complete interpretation of this behavior and to determine the number of states involved in the transition.
SPANSWW2 and SPANSWW1 versus the hPin1 WW domain
Both designed proteins exhibited decreased stability compared to the WW domain from hPin1. Interestingly, they contained none of the severely destabilizing mutations of the wild type identified by Gruebele and co-workers (Jäger et al. 2001). Two types of interactions appear crucial in stabilizing the native state of the hPin1 WW domain: hydrophobic, in two segregated clusters, and H-bonding, in an extended network. Remarkably, almost all of the residues involved in the two hydrophobic clusters (Leu2, Trp6, Tyr19, Pro32 in cluster 1, and Tyr18, Phe20, and the n-propyl group of Arg9 in cluster 2; Fig. 1) were present in the designed sequences. SPANSWW1 had one mutation (F20Y) to this hydrophobic cluster; SPANSWW2 had this mutation and another (R9L). The F20Y mutation is conservative and may exert little effect on the stability of the fold. The R9L mutation in SPANSWW2 is less conservative: although the hydrophobicity of this side chain is similar to that of Arg, the van der Waals interactions between residue 9 and the rest of cluster 2 may be weakened because of the Leu9 rotameric state selected in the SPANSWW2 matrix. Both of the designed proteins exhibited basic hydrogen-bonded interactions similar to those reported to be important for structural stability in the wild-type polypeptide; the largest deviation involved the loss of a postulated interaction between Glu7 and Arg9 (Jäger et al. 2001). The designed proteins also had a mutation of the functional, solvent-exposed Trp (W29N in both); a W29A mutation in the wild type was shown to be moderately destabilizing (3 kJ mol⁻¹ < ΔΔG° < 6 kJ mol⁻¹), possibly because of the loss of a cation–aromatic interaction with Arg16 or an aromatic–aromatic interaction with Tyr18 (Jäger et al. 2001). Asn in this position may have the same effect, as it cannot participate in these interactions.
It remains unclear why SPANSWW2 populated a typical WW-like fold, whereas SPANSWW1 did not to any great extent. The side chain hydrogen bonding networks predicted for WW2 were more dispersed throughout the protein structure than those for WW1; perhaps these additional hydrogen bonds available to WW2 served to select for the target structure and maintain it at higher temperatures. This may indicate that the treatment of rotamers used in the design procedure for WW2 was more effective than that used for WW1.
Relationship to other β protein design
It is instructive to compare our WW domain results with other successful designs and the properties of "mini-proteins." Andersen and co-workers prepared a 20-residue polypeptide capable of folding into a stable "Trp-cage" motif (Neidigh et al. 2002). Unlike the WW domain, this small motif contains an α structure, but it has in common with it essential hydrophobic interactions between Trp and Pro residues. This type of interaction may play an important role in stabilizing the folded state of several small proteins (Gellman and Woolfson 2002; Neidigh et al. 2002).
Production of small β proteins has been demonstrated in several laboratories (de Alba et al. 1996; Ramirez-Alvarado et al. 1996; Schenk and Gellman 1998; Lopez de la Paz et al. 2001; Ottesen and Imperiali 2001). Key to the stabilization of the fold is the engineering of proper reverse turns; the use of D-amino acids has been particularly helpful to this end (Schenk and Gellman 1998; Ottesen and Imperiali 2001), as has the placement of disulfide bridges at or near turns to increase the propensity of the desired fold (Ottesen and Imperiali 2001). In addition, this body of work indicates the importance of cross-strand interactions and judicious exposure of hydrophobic side chains to solvent to define a single conformation by limiting the ability of the strands to alter register.
The procedure applied here demonstrates a simple means to include backbone flexibility in protein design. With this method, residues found to be key to the fold are still retrieved but excessive steric constraints are relieved and the peptide can adjust to satisfy nonbonding interactions. The cumulative backbone displacements are likely to distort the structure, at least locally, compared to the target, but the desired fold is attained. Although we have not demonstrated that the use of backbone flexibility is essential for successful full sequence design, it appears to be an appropriate method for mitigating the errors inherent in the more frequently used fixed backbone assumption. It is also arguably more realistic. In addition, the sampling methods we describe provide a new avenue for treatment of more sophisticated design problems. Examples include the design of proteins with specifically tuned dynamic properties, and the design of sequences that are compatible with a number of discrete conformational states.
Full sequence design remains a considerable challenge for the protein design community. The primary difficulty is the identification of one or more suitable sequences out of the manifold possible sequences for a given protein length. For the design problems treated herein, there were 20^34 possible sequences from which to choose. A suitable sequence must preferentially adopt the desired fold versus myriad alternative folded, unfolded, and aggregated states. The data strongly suggest that our SPANS algorithm and backbone sampling methods have generated at least one novel WW domain, constituting a successful search through sequence space. This novel sequence folds autonomously into a state that is similar to structures adopted by wild-type WW domains. Despite this success, several issues remain for our design algorithm. First, only one of the two designed sequences described herein convincingly adopts the target WW domain structure. Second, although this sequence is more stable than many naturally occurring WW domains, it is considerably less stable than the wild-type WW domain from which the target structure was derived. This result indicates that deficiencies remain in the scoring functions or sampling methods, or both, used to generate the sequences. It is hoped that future developments focusing on the quality of scoring functions and a systematic analysis of the backbone degrees of freedom will lead to further improvements in predictive capacity.
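To give a sense of scale for the search problem, the size of the sequence space for 20 amino acid types at 34 positions can be computed directly. The variable names below are illustrative.

```python
import math

n_amino_acids = 20
n_positions = 34

# Python integers are exact, so the full count can be held without overflow.
n_sequences = n_amino_acids ** n_positions

print(f"{n_sequences:.3e}")                # 1.718e+44
print(round(math.log10(n_sequences), 1))   # 44.2
```

A space of roughly 10^44 sequences cannot be enumerated, which is why a guided probabilistic search is needed rather than exhaustive evaluation.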
| Materials and methods |
|---|
competent cells, which were originally purchased from Invitrogen Life Technologies as a glycerol stock; and chemicals were purchased from J.T. Baker (Philipsburg, NJ) and used without further purification. A novel fusion system was developed for the expression and purification of the WW peptides (WW from hPin1 as well as SPANSWW1 and -WW2), which used the high expression of the amino terminus of a D78Y mutant of calmodulin from Homo sapiens (N-Cam.Y) in Escherichia coli and the ease of purification of N-Cam systems through Ca2+-dependent binding to phenyl Sepharose. A linker was placed in a position amino-terminal to the WW gene product for proteolytic purposes; its sequence was ENLYFQ/GS, where / indicates the site of protease cleavage by the mutant (His)6-tagged NIa-Pro tobacco etch virus protease. The NIa cleavage system was chosen because of its cleavage to ~95% completion in most non-SDS buffers in 48 h at room temperature. This fusion system led to the ready expression and purification of all of the WW designs discussed herein.
Synthetic oligonucleotides were used to construct inserts that coded for the proteins of interest. The assembled genes were then cloned into a pAED4-derived fusion system 3' to the gene coding for N-Cam.Y and the NIa linker region. Proteins were expressed in BL21(DE3)pLysS E. coli cells or HMS174(DE3)pLysS E. coli cells to the same approximate level. All were localized in the soluble portion of the cell. Expressed fusion proteins were purified using calcium-dependent affinity of N-Cam.Y for phenyl Sepharose resin; the supernatant after cell lysis was applied at low pressure using a BioRad LP system over a phenyl Sepharose column in the presence of 5 mM Ca2+ and 500 mM Na+ at pH 7.5. Once the background cellular proteins had been washed from the column, the fusion protein was eluted with a 5 mM EDTA and 500 mM Na+ solution at pH 7.5. The NIa protease was then added in a 1:100 or 2:100 protease-to-protein solution volume ratio in the EDTA buffer; this reaction was allowed to proceed for 2 to 3 days at room temperature. Cleavage of the proteins with the NIa protease was estimated to be ~90%–95% complete as determined by SDS-PAGE.
The cleaved WW domains were isolated using reversed phase HPLC or further phenyl Sepharose purification (where in this instance the WW design was contained in the eluent from the first wash). In the case of the former purification scheme, the column was washed with 100% HPLC-grade water with 0.1% TFA. A gradient of 90% acetonitrile (Burdick and Jackson)/10% HPLC-grade water with 0.1% TFA was initialized. The fractions of interest were lyophilized and the solid material resuspended in distilled water. The identity and integrity of the designed proteins were qualitatively analyzed by SDS-PAGE and confirmed by electrospray mass spectrometry. Concentrations of the proteins were calculated using the absorbance at 280 nm in nondenaturing conditions and the absorbance coefficient as calculated from the primary structure.
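A concentration estimate of this kind can be sketched as follows. The text does not say how the absorbance coefficient was computed from the primary structure; the example assumes the widely used per-residue values of 5500 M⁻¹ cm⁻¹ for Trp, 1490 M⁻¹ cm⁻¹ for Tyr, and 125 M⁻¹ cm⁻¹ per cystine, together with the Beer-Lambert law. The sequence fragment and absorbance reading are hypothetical.

```python
def extinction_coefficient_280(sequence, n_cystines=0):
    """Estimate the 280 nm extinction coefficient in M^-1 cm^-1.

    Assumption: per-residue contributions of 5500 (Trp), 1490 (Tyr), and
    125 (cystine); the study may have used a different parameter set.
    """
    return 5500 * sequence.count("W") + 1490 * sequence.count("Y") + 125 * n_cystines


def concentration_micromolar(a280, sequence, path_cm=1.0, n_cystines=0):
    """Beer-Lambert law: c = A / (epsilon * l), converted from M to micromolar."""
    epsilon = extinction_coefficient_280(sequence, n_cystines)
    return a280 / (epsilon * path_cm) * 1e6


# Hypothetical fragment containing one Trp and two Tyr residues.
fragment = "GSWEKYYN"
epsilon = extinction_coefficient_280(fragment)
conc = concentration_micromolar(0.5, fragment)
```

For a peptide with one Trp and several Tyr residues, an absorbance of about 0.5 in a 1-cm cell corresponds to a concentration of a few tens of micromolar, the same order as the CD samples described below.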
Mutations were made as follows. Primers were designed with high complementarity to the surrounding regions of the gene but alterations (to reflect the desired mutation) in the region of the codon to be mutated. The primers were designed to maximize Tm (by increasing the GC content) and minimize the amount of noncomplementarity of the primer with the template. Reaction mixtures were prepared of 35.7 µL of distilled water, 5 µL of 10× ThermoPol buffer, 5 µL of 10 ng/µL plasmid template, 1.4 µL of each primer at 10 µM, and 0.5 µL of Vent DNA polymerase, 2 U/µL. The PCR program consisted of 2 min at 95°C, and 16 cycles of 30 sec at 95°C followed by 55°C for 8.5 min. The reaction mixtures were cooled to 4°C. This mixture was digested with 1 µL of 20 U/µL DpnI and incubated at 37°C for 12 h. Two microliters of this reaction mixture was transformed into DH5α and plated. Several of the colonies were sequenced for the presence or absence of the mutation in question.
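The reaction composition can be checked with a little arithmetic. The volumes and stock concentrations below are those quoted in the text; the dictionary layout and variable names are illustrative.

```python
# Volumes in microliters, as listed in the text (two primers at 1.4 uL each).
volumes_ul = {
    "water": 35.7,
    "10x ThermoPol buffer": 5.0,
    "plasmid template": 5.0,
    "forward primer": 1.4,
    "reverse primer": 1.4,
    "Vent DNA polymerase": 0.5,
}

total_ul = sum(volumes_ul.values())

template_ng = volumes_ul["plasmid template"] * 10.0             # 10 ng/uL stock
primer_final_um = volumes_ul["forward primer"] * 10.0 / total_ul  # 10 uM stock
polymerase_units = volumes_ul["Vent DNA polymerase"] * 2.0      # 2 U/uL stock

print(f"total volume: {total_ul:.1f} uL")
print(f"template: {template_ng:.0f} ng, each primer: {primer_final_um:.2f} uM, "
      f"polymerase: {polymerase_units:.1f} U")
```

The listed components add up to 49.0 µL, giving 50 ng of template, about 0.29 µM of each primer, and 1 U of polymerase per reaction.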
Spectroscopic characterization
The purified protein was characterized by CD on an Aviv CD model 62 DS (Lakewood, NJ). A positive peak near 230 nm is typical of the WW-type fold (Koepf et al. 1999a; Deechongkit and Kelly 2002). Circular dichroism spectra were, therefore, used to probe for the presence of the WW fold and the ~230 nm peak was followed to assess its thermal stability. The buffer conditions were 10 mM phosphate at pH 7.0 (or 10 mM MOPS at pH 7.2), and 100 mM NaCl. The protein concentrations were between 50 and 60 µM. The temperature of acquisition was 2°C to populate the native state. For the thermal denaturation experiments, the temperature was raised from 2°C to 98°C in 1.5°–2°C steps. Equilibration time at each temperature was 0.1–1.5 min and averaging time was 7–90 sec depending on the protein. Reversibility was tested by recording data from 98°C to 2°C in the same stepwise fashion. The thermal denaturation profiles were fitted to a Gibbs-Helmholtz free-energy equation using a nonlinear least-squares routine (Nfit, University of Texas, Galveston, TX), as given in equation 2:
![]() | (2) |
In this equation, Y(T) represents the observed signal at any temperature; YN(T) represents the signal of the native state; YU(T) represents the signal of the unfolded state; and the thermodynamic quantities are for the N ⇌ U reaction. The subscript m indicates quantities at the midpoint of the transition. The absence of a well-defined folded state baseline prevented a determination of thermodynamic quantities and only estimates of Tm, a relatively robust parameter, are reported. Because no conditions were found under which the folded state was fully populated, chemical denaturation was expected to show an equally ill-defined native baseline and was not attempted to obtain a standard free energy of unfolding.
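A two-state model of this kind can be sketched as follows. The code uses the standard textbook form of the Gibbs-Helmholtz equation with linear baselines, which is an assumption and may differ in detail from equation 2 and from the Nfit implementation. The enthalpy, baseline parameters, and function names are hypothetical; only the Tm near 20°C comes from the text.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def delta_g(temp_k, tm_k, dh_m, dcp=0.0):
    """Gibbs-Helmholtz free energy of unfolding (kJ/mol) for N <-> U."""
    return dh_m * (1.0 - temp_k / tm_k) + dcp * (
        temp_k - tm_k - temp_k * math.log(temp_k / tm_k)
    )


def fraction_unfolded(temp_k, tm_k, dh_m, dcp=0.0):
    """Population of U from the equilibrium constant K = exp(-dG/RT)."""
    k = math.exp(-delta_g(temp_k, tm_k, dh_m, dcp) / (R_KJ * temp_k))
    return k / (1.0 + k)


def signal(temp_k, tm_k, dh_m, native, unfolded, dcp=0.0):
    """Observed signal; native and unfolded are (intercept, slope) baselines."""
    y_n = native[0] + native[1] * temp_k
    y_u = unfolded[0] + unfolded[1] * temp_k
    f_u = fraction_unfolded(temp_k, tm_k, dh_m, dcp)
    return (1.0 - f_u) * y_n + f_u * y_u


TM_K = 293.15   # Tm near 20 C, as reported for SPANSWW2
DH_M = 100.0    # hypothetical unfolding enthalpy at Tm, kJ/mol

curve = [(t, fraction_unfolded(t + 273.15, TM_K, DH_M)) for t in range(2, 99, 2)]
```

With a Tm near 20°C and a modest enthalpy, a substantial fraction of molecules is already unfolded at 2°C in this model, which illustrates why a native baseline is hard to define for such a protein.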
SPANSWW2 exhibited an intriguing pH-dependent behavior at high concentration of protein. When it was resuspended after lyophilization and the pH was raised above the pI, reduced positive ellipticity at 230 nm and greatly increased turbidity and viscosity were observed. To break the aggregates, the samples were briefly heated to 95°C and allowed to cool gradually to room temperature. This annealing process restored a clear solution and the typical 230-nm peak. The trapping of the peptide in multimeric structure is reminiscent of the metastable β network observed in amyloid peptides and in some proteins under non-native conditions (Dobson 1999).
For 1H NMR spectroscopy, solutions of SPANSWW2 (700–900 µM) were prepared in 90% H2O/10% D2O or 100% D2O by resuspension of the lyophilized powder. Several annealing steps were necessary when raising the pH above the theoretical pI of the protein. One annealing cycle consisted of placing the sample on a 90°C heat block for 5 min, allowing the system to cool to room temperature, and adjusting the pH again. The pH(*) was measured with a Beckman Φ71 pH meter equipped with a small bore probe (Mettler Toledo, Columbus, OH) and adjusted to 6.8–7.2 with HCl (DCl) or NaOH (NaOD). Sodium azide (Sigma, ~3 mM) was added for inhibition of bacterial growth and the buffer was adjusted to 20 mM phosphate for the D2O experiments. Data were collected at 600 MHz on a Bruker DRX spectrometer (Billerica, MA) with probe temperature set between 5° and 35°C. All homonuclear 2D data were obtained at 5°C for optimal population of the folded state. NOESY (Kumar et al. 1980), DQF-COSY (Rance et al. 1983) and relaxation-compensated TOCSY (Cavanagh and Rance 1992) data were collected with TPPI quadrature detection in the indirect dimension (10 kHz spectral width, 4096* × 512 points, 64 transients). The mixing time was 120 msec for the NOESY data, and 45 msec for the TOCSY data, with a DIPSI-2 spin locking pulse train (Shaka et al. 1988). Suppression of the water signal was achieved with a 1.2-sec low-power saturation pulse or a modified WATERGATE sequence (Piotto et al. 1992; Sklenář et al. 1993). Time-domain data were subjected to multiplication with a 45°-shifted squared sine bell function before transformation. Final matrix size was 2048 × 2048 real. For the 1D variable temperature study in D2O, the temperature of the probe was decreased from 35°C to 5°C in 5°C increments. A total of 256 to 4000 transients were collected per temperature with other parameters as above. Chemical shifts were referenced indirectly to DSS through water after correction for temperature (Wishart et al. 1995).
| Electronic supplemental material |
|---|
H shifts in SPANSWW2 and apo hPin1 WW domain. The information is within a single pdf file (cmkp-esm.pdf).
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
α-helix but do not alter protein stability. Science 239: 631–635.
Betz, S.F., Liebman, P.A., and DeGrado, W.F. 1997. De novo design of native proteins: Characterization of proteins intended to fold into antiparallel, rop-like, four-helix bundles. Biochemistry 36: 2450–2458.
Bryson, J.W., Desjarlais, J.R., Handel, T.M., and DeGrado, W.F. 1998. From coiled coils to small globular proteins: Design of a native-like three-helix bundle. Protein Sci. 7: 1404–1414.
Cavanagh, J. and Rance, M. 1992. Suppression of cross-relaxation effects in TOCSY spectra via a modified DIPSI-2 mixing sequence. J. Magn. Reson. 96: 670–678.
Crane, J.C., Koepf, E.K., Kelly, J.W., and Gruebele, M. 2000. Mapping the transition state of the WW domain β-sheet. J. Mol. Biol. 298: 283–292.
Dahiyat, B.I. and Mayo, S.L. 1997. De novo protein design: Fully automated sequence selection. Science 278: 82–87.
de Alba, E., Jimenez, M.A., Rico, M., and Nieto, J.L. 1996. Conformational investigation of designed short linear peptides able to fold into β-hairpin structures in aqueous solution. Fold. Des. 1: 133–144.
Deechongkit, S. and Kelly, J.W. 2002. The effect of backbone cyclization on the thermodynamics of β-sheet unfolding: Stability optimization of the PIN WW domain. J. Am. Chem. Soc. 124: 4980–4986.
Dobson, C.M. 1999. Protein misfolding, evolution and disease. Trends Biochem. Sci. 24: 329–332.
Ferguson, N., Pires, J.R., Toepert, F., Johnson, C.M., Pan, Y.P., Volkmer-Engert, R., Schneider-Mergener, J., Daggett, V., Oschkinat, H., and Fersht, A. 2001. Using flexible loop mimetics to extend Φ-value analysis to secondary structure interactions. Proc. Natl. Acad. Sci. 98: 13008–13013.
Gellman, S.H. and Woolfson, D.N. 2002. Mini-proteins Trp the light fantastic. Nat. Struct. Biol. 9: 408–410.
Harbury, P.B., Tidor, B., and Kim, P.S. 1995. Repacking protein cores with backbone freedom: Structure prediction for coiled coils. Proc. Natl. Acad. Sci. 92: 8408–8412.
Harbury, P.B., Plecs, J.J., Tidor, B., Alber, T., and Kim, P.S. 1998. High-resolution protein design with backbone freedom. Science 282: 1462–1467.
Jäger, M., Nguyen, H., Crane, J.C., Kelly, J.W., and Gruebele, M. 2001. The folding mechanism of a β-sheet: The WW domain. J. Mol. Biol. 311: 373–393.
Jones, D.T. 1999. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292: 195–202.
Karanicolas, J. and Brooks III, C.L. 2003. The structural basis for biphasic kinetics in the folding of the WW domain from a formin-binding protein: Lessons for protein design? Proc. Natl. Acad. Sci. 100: 3954–3959.
Koepf, E.K., Petrassi, H.M., Ratnaswamy, G., Huff, M.E., Sudol, M., and Kelly, J.W. 1999a. Characterization of the structure and function of W → F WW domain variants: Identification of a natively unfolded protein that folds upon ligand binding. Biochemistry 38: 14338–14351.
Koepf, E.K., Petrassi, H.M., Sudol, M., and Kelly, J.W. 1999b. WW: An isolated three-stranded antiparallel β-sheet domain that unfolds and refolds reversibly; evidence for a structured hydrophobic cluster in urea and GdnHCl and a disordered thermal unfolded state. Protein Sci. 8: 841–853.
Kortemme, T., Ramirez-Alvarado, M., and Serrano, L. 1998. Design of a 20-amino acid, three-stranded β-sheet protein. Science 281: 253–256.
Kowalski, J.A., Liu, K., and Kelly, J.W. 2002. NMR solution structure of the isolated Apo Pin1 WW domain: Comparison to the x-ray crystal structures of Pin1. Biopolymers 63: 111–121.
Kraemer-Pecore, C.M. 2002. "Computational design and experimental verification of protein domains." Ph.D. thesis. The Pennsylvania State University, University Park, PA.
Kraemer-Pecore, C.M., Wollacott, A.M., and Desjarlais, J.R. 2001. Computational protein design. Curr. Opin. Chem. Biol. 5: 690–695.
Kraulis, P. 1991. MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24: 946–950.
Kuhlman, B. and Baker, D. 2000. Native protein sequences are close to optimal for their structures. Proc. Natl. Acad. Sci. 97: 10383–10388.
Kumar, A., Ernst, R.R., and Wüthrich, K. 1980. A 2D NOE experiment for the elucidation of complete proton–proton cross-relaxation networks in biological macromolecules. Biochem. Biophys. Res. Commun. 95: 1–6.
Kuznetsov, S.V., Hilario, J., Keiderling, T.A., and Ansari, A. 2003. Spectroscopic studies of structural changes in two β-sheet-forming peptides show an ensemble of structures that unfold noncooperatively. Biochemistry 42: 4321–4332.
Larson, S.M., Garg, A., Desjarlais, J.R., and Pande, V.S. 2003. Increased detection of structural templates using alignments of designed sequences. Proteins 51: 390–396.
Lopez de la Paz, M., Lacroix, E., Ramirez-Alvarado, M., and Serrano, L. 2001. Computer-aided design of β-sheet peptides. J. Mol. Biol. 312: 229–246.
Macias, M.J., Hyvonen, M., Baraldi, E., Schultz, J., Sudol, M., Saraste, M., and Oschkinat, H. 1996. Structure of the WW domain of a kinase-associated protein complexed with a proline-rich peptide. Nature 382: 646–649.
Macias, M.J., Gervais, V., Civera, C., and Oschkinat, H. 2000. Structural analysis of WW domains and design of a WW prototype. Nat. Struct. Biol. 7: 375–379.
Neidigh, J.W., Fesinmeyer, R.M., and Andersen, N.H. 2002. Designing a 20-residue protein. Nat. Struct. Biol. 9: 425–430.
Nguyen, H., Jäger, M., Moretto, A., Gruebele, M., and Kelly, J.W. 2003. Tuning the free-energy landscape of a WW domain by temperature, mutation, and truncation. Proc. Natl. Acad. Sci. 100: 3948–3953.
Otte, L., Wiedemann, U., Schlegel, B., Pires, J.R., Beyermann, M., Schmieder, P., Krause, G., Volkmer-Engert, R., Schneider-Mergener, J., and Oschkinat, H. 2003. WW domain sequence activity relationships identified using ligand recognition propensities of 42 WW domains. Protein Sci. 12: 491–500.
Ottesen, J.J. and Imperiali, B. 2001. Design of a discretely folded mini-protein motif with predominantly β-structure. Nat. Struct. Biol. 8: 535–539.
Pessi, A., Bianchi, E., Crameri, A., Venturini, S., Tramontano, A., and Sollazzo, M. 1993. A designed metal-binding protein with a novel fold. Nature 362: 367–369.
Piotto, M., Saudek, V., and Sklenář, V. 1992. Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions. J. Biomol. NMR 2: 661–665.
Quinn, T.P., Tweedy, N.B., Williams, R.W., Richardson, J.S., and Richardson, D.C. 1994. βdoublet: De novo design, synthesis, and characterization of a β-sandwich protein. Proc. Natl. Acad. Sci. 91: 8747–8751.
Raha, K., Wollacott, A.M., Italia, M.J., and Desjarlais, J.R. 2000. Prediction of amino acid sequence from structure. Protein Sci. 9: 1106–1119.
Ramirez-Alvarado, M., Blanco, F.J., and Serrano, L. 1996. De novo design and structural analysis of a model β-hairpin peptide system. Nat. Struct. Biol. 3: 604–612.
Rance, M., Sørensen, O.W., Bodenhausen, G., Wagner, G., Ernst, R.R., and Wüthrich, K. 1983. Improved spectral resolution in COSY 1H NMR spectra of proteins via double quantum filtering. Biochem. Biophys. Res. Commun. 117: 479–485.
Ranganathan, R., Lu, K.P., Hunter, T., and Noel, J.P. 1997. Structural and functional analysis of the mitotic rotamase Pin1 suggests substrate recognition is phosphorylation dependent. Cell 89: 875–886.
Regan, L. and DeGrado, W.F. 1988. Characterization of a helical protein designed from first principles. Science 241: 976–978.
Schenk, H.L. and Gellman, S.H. 1998. Use of a designed triple-stranded antiparallel β-sheet to probe β-sheet cooperativity in aqueous solution. J. Am. Chem. Soc. 120: 4869–4870.
Shaka, A.J., Lee, C., and Pines, A. 1988. Iterative schemes for bilinear operators; application to spin decoupling. J. Magn. Reson. 77: 274–293.
Sklenář, V., Piotto, M., Leppik, R., and Saudek, V. 1993. Gradient-tailored water suppression for 1H-15N HSQC experiments optimized to retain full sensitivity. J. Magn. Reson. 102: 241–245.
Walsh, S.T., Cheng, H., Bryson, J.W., Roder, H., and DeGrado, W.F. 1999. Solution structure and dynamics of a de novo designed three-helix bundle protein. Proc. Natl. Acad. Sci. 96: 5486–5491.
Wishart, D.S., Bigam, C.G., Yao, J., Abildgaard, F., Dyson, H.J., Oldfield, E., Markley, J.L., and Sykes, B.D. 1995. 1H, 13C and 15N chemical shift referencing in biomolecular NMR. J. Biomol. NMR 6: 135–140.
Wollacott, A.M. and Desjarlais, J.R. 2001. Virtual interaction profiles of proteins. J. Mol. Biol. 313: 317–342.
Yan, Y. and Erickson, B.W. 1994. Engineering of betabellin 14D: Disulfide-induced folding of a β-sheet protein. Protein Sci. 3: 1069–1073.
Zhou, N.E., Kay, C.M., and Hodges, R.S. 1994. The role of interhelical ionic interactions in controlling protein folding and stability. De novo designed synthetic two-stranded α-helical coiled-coils. J. Mol. Biol. 237: 500–512.