α-helical model peptides. Fingerprint of the 20 naturally occurring amino acids
1 Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853-1301, USA
2 Universidad Nacional de San Luis, Facultad de Ciencias Físico, Matemáticas y Naturales, Instituto de Matemática Aplicada San Luis, CONICET, San Luis, Argentina
3 Universidad Nacional de San Luis, Departamento de Química, San Luis, Argentina
Reprint requests to: Harold A. Scheraga, Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853-1301, USA; e-mail: has5{at}cornell.edu; fax: (607) 254-4700.
(RECEIVED June 12, 2004; FINAL REVISION July 28, 2004; ACCEPTED August 5, 2004)
| Abstract |
|---|
α-helical model peptides represented by the sequence Ac-(Ala)i-X-(Ala)j-NH2, where X represents any of the 20 naturally occurring amino acids, with 0 ≤ i ≤ 8 and i + j = 8. Adoption of the locally dense basis approach for the quantum chemical calculations enabled us to reduce the length of the chemical-shift calculations while maintaining good accuracy of the results. For the 20 naturally occurring amino acids in α-helices, there is (1) significant variability of the computed 13C shielding as a function of both the guest residue (X) and the position along the sequence; for example, at the N terminus, the 13Cα and 13Cβ shieldings exhibit a uniform pattern of variation with respect to both the central and the C-terminal positions; (2) good agreement between computed and observed 13Cα and 13Cβ chemical shifts in the interior of the helix, with correlation coefficients of 0.98 and 0.99, respectively; for 13Cα chemical shifts, computed in the middle of the helix, only five residues, namely Asn, Asp, Ser, Thr, and Leu, exhibit chemical shifts beyond the observed standard deviation; and (3) better agreement for four of these residues (Asn, Asp, Ser, and Thr) only for the computed values of the 13Cα chemical shifts at the N terminus. The results indicate that 13Cα, but not 13Cβ, chemical shifts are sensitive enough to reflect the propensities of some amino acids for specific positions within an α-helix, relative to the N and C termini of peptides and proteins.
Keywords: 13C chemical shifts; α-helical peptides; helix breaker; locally dense basis approach
| Introduction |
|---|
13Cα and 13Cβ chemical shifts in polypeptides are influenced mainly by backbone geometry and, therefore, are valuable quantities with which to identify secondary structure (Spera and Bax 1991; Kuszewski et al. 1995; Iwadate et al. 1999), with no influence of amino acid sequence (Iwadate et al. 1999; Xu and Case 2002). The influence of the backbone geometry appears as an approximately 4 ppm and 2 ppm chemical-shift separation between α-helical and β-sheet residues from the corresponding experimental 13Cα and 13Cβ chemical shifts, respectively (Spera and Bax 1991). Despite the enormous effort and progress in this field during the last few years (Iwadate et al. 1999; Wishart and Case 2001; Xu and Case 2001, 2002; Oldfield 2002), a detailed characterization of the factors affecting the 13Cα and 13Cβ chemical shifts for each of the 20 naturally occurring amino acids remains to be elucidated. As noted by Xu and Case (2002), "deciphering such effects from empirical data alone is a complex undertaking." For this reason, Iwadate et al. (1999) and Xu and Case (2002), using semiempirical and ab initio calculations, respectively, carried out an analysis of the factors that affect the 13Cα, 13Cβ, and 13C' chemical shifts. Among other factors that influence the shielding of these nuclei, Iwadate et al. (1999) discussed the effect of a backbone hydrogen bond on the 13C chemical shifts, concluding that hydrogen bonding has a small but nonnegligible influence. They also illustrated the importance of such an effect, noting the dependence of 13C chemical shifts on the position within α-helical structures. Later, Xu and Case (2002) extended this analysis by using quantum mechanical calculations on β-sheet and α-helical model peptides. However, the position dependence within the α-helix for each of the 20 naturally occurring amino acids was not computed.
Figure 1 shows that there are clear differences between the ends and the interior of an α-helical structure, that is, the end residues involve different numbers of intramolecular hydrogen bonds than the interior residues. Thus, certain questions arise and are addressed here: (1) How far into the interior does the end effect penetrate? (2) Are these effects amino-acid specific? and (3) How do these effects influence the 13Cα and 13Cβ chemical shifts for each of the 20 naturally occurring amino acids? Furthermore, it is necessary to demonstrate whether or not the position dependence of 13C chemical shifts reflects the propensities observed for some amino acids to occupy the N or the C terminus of α-helices in proteins and peptides. If this correlation exists, its characterization could play a very important role (1) for refinement of NMR-derived 3D models, and (2) for accurate prediction of the secondary structure of a protein when the 13C chemical shift assignments and the amino acid sequence are the only available information.
α-helical and non-α-helical regions of the various dipeptide types appearing in a sample of seven proteins of known sequence and structure. They reported "...a sharp change in the nature of the observed dipeptide types when the helix-coil boundary is crossed..." These results were interpreted as evidence for the importance of short-range interactions in determining protein conformation. This concept of the dominance of short-range interactions (Scheraga 1973) was the basis for the host–guest approach to determine the helix-forming propensity of the 20 naturally occurring amino acids (Wojcik et al. 1990) and for the success (Lewis et al. 1970) in predicting the tendency toward helix formation in unfolded proteins (that are poised to fold) that matched the native structure; this concept later received direct experimental confirmation from fluorescence resonance energy transfer experiments (Navon et al. 2001).
Subsequently, Richardson and Richardson (1988) also showed that the observed distribution of amino acids at the N or the C terminus of α-helices is quite different from that at the interior positions. Almost simultaneously, a possible explanation of why amino acid residues occur with certain frequencies at these positions was suggested by Presta and Rose (1988) based on the depletion of hydrogen bonds at the ends (Fig. 1). Following the suggestions mentioned above, the distribution of amino acids at these sites has been examined in peptides (Nicholson et al. 1988; Bruch et al. 1991; Chakrabartty et al. 1993; Doig and Baldwin 1995) and proteins (Serrano and Fersht 1989; Lecomte and Moore 1991; Aurora et al. 1994; Zhukovsky et al. 1994).
From all this evidence, it is clear that both the amino acid composition and the structural properties at the N and C termini of the α-helical structure, shown in Figure 1, must be taken into account in the computation of the 13C chemical shift. As was already noted, for the terminally blocked Ac-(Ala)9-NH2 peptide, the first three amide hydrogens at the N terminus and the last three carbonyl groups at the C terminus are not involved in i-(i + 4) backbone hydrogen bonds. The influence of the backbone hydrogen bond pattern, characteristic of the α-helix, on the 13Cα and 13Cβ shielding for each of the 20 naturally occurring amino acids will be discussed. It is worth noting that the number of unsatisfied backbone hydrogen bonds would be four if the sequence shown in Figure 1 were unblocked.
The Results and Discussion section of this paper is organized as follows: In section I, we use ab initio quantum chemical calculations on α-helical model peptides to compute the position dependence of the 13C chemical shift (here 13Cα and 13Cβ) for each of the 20 naturally occurring amino acids by assuming that the residues are neutral, except for aspartate, for which we considered both protonation states at all positions. In section II, we analyze the effect of the state of ionization on the 13C chemical shift at the specific position in the middle of the helix for all the ionizable residues, considering both charged and uncharged states. The expected different chemical shifts at the end and in the middle of the α-helix, and their correlation with the values observed by Wang and Jardetzky (2002), will be discussed in section III. This analysis will focus on residues for which the differences between computed and observed values are beyond the standard deviation, namely Asn, Asp, Ser, Thr, and Leu. The 13C chemical shifts for Gly and Pro, which represent the most flexible and the most constrained amino acid residue, respectively, will also be analyzed in this section.
The section on Concluding Remarks deals with a discussion of the results and their role in the characterization and refinement of proteins. Finally, in Materials and Methods, the convention and approaches adopted to describe the α-helical model peptides as well as to compute the quantum chemical 13C chemical shifts are described.
| Results and Discussion |
|---|
chemical shifts for each of the naturally occurring amino acids. Figure 2A shows the 13Cα chemical shifts obtained for Lys as a function of the position of this residue along the α-helix. This figure shows that the 13Cα chemical shifts of the residues at the N terminus, that is, the first turn of the helix, are more shielded (upfield shift) with respect to the central residues of the molecule, viz., for residues in positions 4–7. This trend of the data is observed for 18 out of 19 residues. The exception is Gly because the most deshielded (downfield shift) 13Cα chemical shifts occur at position 1, as shown in Figure 2B.
shielding shows a multifaceted behavior. Some amino acids (Lys, Arg, His, Gln, Glu, Leu, and Met) are deshielded (downfield shift), in agreement with the empirical observation of Iwadate et al. (1999); for others (Phe, Ile, Asn, Ala, Ser, Asp, Tyr, Thr, Val, and Cys), there are, on average, no significant changes, while for Trp there is a shielding of the 13Cα atom.
For the N terminus, the 13Cβ shielding shows, as a characteristic pattern, that the first two residues are deshielded (downfield shift) with respect to the central residues of the molecule, viz., those occupying positions 4–7, as shown in Figure 3A for Ala. On the other hand, when we move from the C terminus to the middle of the α-helix, the position-dependent changes do not exhibit a unique behavior. This is illustrated by comparing Figure 3B for Ile with Figure 3A for Ala. Based on this evidence, when we move from the C terminus to the N terminus, we can categorize the amino acid residues as showing (1) a monotonic shielding (upfield shift for Ala, Asn, Asp, Cys, Phe, and Ser), (2) a monotonic deshielding (downfield shift for Val, Thr, Met, and Ile), or (3) a nonregular behavior (for Arg, Gln, Glu, His, Leu, Lys, Trp, and Tyr); of course, Gly has no Cβ, and Pro is treated separately in section III.
13Cα and 13Cβ shielding computed at the N terminus shows a uniform pattern of variations with respect to the rest of the molecule, while in the middle of the helix, as well as for the C terminus, the pattern of changes is not uniform. This asymmetry observed with the position may constitute valuable information to assess positional preferences of the amino acids, as we will discuss below. The origin of this effect could be attributed to the already-noted different hydrogen-bonded character between the ends and the interior of the α-helical conformation (see Fig. 1).
and 13Cβ chemical shifts. Figure 4 shows Δ = [13Cμ,max − 13Cμ,min], with μ = α or β, for each of the naturally occurring amino acids except for 13Cβ for Gly and proline. 13Cμ,max and 13Cμ,min represent the maximum and minimum chemical shift values computed for each residue X in the sequence Ac-(Ala)i-X-(Ala)j-NH2 (with i + j = 8; 0 ≤ i ≤ 8). The mean value and standard deviation for Δ (with μ = α or β), computed over all the naturally occurring amino acids, are 1.92 ± 0.53 ppm and 2.25 ± 1.26 ppm for 13Cα and 13Cβ chemical shifts, respectively. It is worth noting that the major factor affecting chemical shifts is the backbone geometry, inducing changes of about 4 ppm between α-helix and β-sheet conformations for 13Cα and about 2 ppm for 13Cβ chemical shifts (Spera and Bax 1991). By comparison, the site dependence of the 13C chemical shift along the α-helix has a nonnegligible effect of up to about 3 ppm for the 13Cα chemical shift (of Tyr) and up to about 6 ppm for the 13Cβ chemical shift (of Ile), as shown in Figure 4.
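The range of variation defined above can be sketched in a few lines of Python. This is only an illustration: the shift values below are hypothetical placeholders (not data from this work), the function and variable names are ours, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def shift_range(shifts_by_position):
    """Return max - min (ppm) of the shifts computed at helix positions 1..9."""
    return max(shifts_by_position) - min(shifts_by_position)


# Hypothetical 13C(alpha) shifts (ppm) for a guest residue X at positions 1..9.
# These numbers are illustrative only and are NOT taken from the paper.
example_shifts = {
    "Lys": [57.1, 57.6, 58.0, 58.9, 59.0, 59.1, 58.8, 58.5, 58.2],
    "Gly": [47.2, 46.0, 45.8, 46.1, 46.2, 46.3, 46.2, 46.6, 46.0],
}

# One range per residue, then mean and (sample) standard deviation over residues.
ranges = {res: shift_range(vals) for res, vals in example_shifts.items()}
mean_range = mean(list(ranges.values()))
sd_range = stdev(list(ranges.values()))
```

With the full set of residues in place of the two placeholder entries, `mean_range` and `sd_range` would correspond to the kind of summary reported above for each nucleus.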
and 13Cβ chemical shifts in the middle of the α-helix
The differences (Δ) between the 13C chemical shift for both charged and uncharged side chains and the values observed by Wang and Jardetzky (2002) for the ionizable residues Asp, Arg, Glu, His, Tyr, and Lys were computed for the central residue of the α-helix, that is, with Δ = [13Cμ,comp − 13Cμ,obs], with μ = α or β. Results for Δ are shown in Figure 5. The Δ value for a charged side chain is larger than the observed standard deviation. However, for uncharged side chains, the computed values of Δ are smaller than the observed standard deviations. This observation is in agreement with the result of Xu and Case (2002), who noted that often "electron distributions of charged molecules in water resemble those of the corresponding neutral species in the gas phase more closely than they resemble the gas-phase charged moiety." This result indicates that, even though the average values observed by Wang and Jardetzky (2002) contain charged and uncharged side chains, it is better to use uncharged than charged side chains, except for Asp, to compute the 13Cα chemical shifts. Asp is the only ionizable group for which both a charged and an uncharged side chain lead to Δ values that are not within the observed standard deviation, as can be seen in Figure 5.
(and 13Cβ) chemical shifts computed for residues in the middle of the α-helix, for each of the 20 naturally occurring amino acids (using neutral residues) and the observed mean values of Wang and Jardetzky (2002) from a database containing >6100 amino acids. Similar correlation coefficients were obtained for the 13Cα and 13Cβ chemical shifts, that is, R = 0.97 (slope of 0.91 for the correlation line) and R = 0.99 (slope of 0.98 for the correlation line), respectively. The vertical lines in Figure 6A
chemical shifts; cf. 45 to 70 ppm observed in Figure 6A
chemical shifts) could mask the significance of the smaller deviations in Figure 6B
) between the 13C chemical shift computed for the central residue of the α-helix and the corresponding mean value observed by Wang and Jardetzky (2002). Figure 7A
chemical shift (as open bars) and Figure 7B
chemical shift. In these figures, the vertical lines denote the standard deviation of the mean value observed by Wang and Jardetzky (2002). Five out of 20, and 15 out of 19, residues exceed the observed standard deviation for the 13Cα and 13Cβ chemical shifts, respectively. This example clearly illustrates that the better agreement observed in Figure 6B
and 13C
chemical shifts, was only apparent.
chemical shifts, which exceed the standard deviation observed by Wang and Jardetzky (2002), namely, Asn, Asp, Ser, Thr, and Leu. These residues, together with Gly and Pro, will be analyzed in detail below.
Asn, Asp, Ser, Thr, and Leu
There is evidence (Kotelchuck et al. 1969; Presta and Rose 1988; Richardson and Richardson 1988) indicating the existence of different preferences of the amino acids between the N and C termini and the interior positions in the α-helix. This effect has been noted for peptides (Nicholson et al. 1988; Fairman et al. 1989; Bruch et al. 1991; Lyu et al. 1992; Doig and Baldwin 1995) and proteins (Serrano and Fersht 1989; Serrano et al. 1992; Zhukovsky et al. 1994). In particular, N-terminal preferences for peptides and proteins agree very well among themselves (Chakrabartty et al. 1993; Doig and Baldwin 1995). Among all these studies, Richardson and Richardson (1988) pointed out that Asn, Asp, Ser, Thr, and Gly, which are helix breakers (Wojcik et al. 1990), exhibit stronger preferences than the rest of the naturally occurring amino acids to be at the N terminus in proteins. Conceivably, the mean values observed by Wang and Jardetzky (2002) should reflect these properties. To examine this possibility, we recomputed the values of Δ by using the 13Cα chemical shifts at the N terminus (shown in Fig. 7A as black-filled bars), in place of those computed in the middle of the α-helix (shown in Fig. 7A as open bars), for residues Asn, Ser, and Thr. The best agreement, in terms of Δ, for each of these residues was found with the 13Cα chemical-shift value computed at position 2 from the N terminus, except for leucine.
The value of Δ computed for leucine (1.3 ppm), for the central position of the α-helix, shown in Figure 7A, is outside of the observed standard deviation (0.98 ppm). This disagreement cannot be explained by site position preferences. In fact, the value of Δ computed from the N terminus gives a higher disagreement (2.9 ppm). To understand whether or not the source of such differences could be due to the side-chain orientation that is adopted by this residue, we carried out additional conformational searches that included five (out of six clustered) conformations (as explained in Materials and Methods). The results indicate that the computed 13Cα chemical shift from the leading-member conformation of family number 3 (57.97 ppm) is closer to the observed one (57.54 ± 0.98 ppm) than the 13Cα chemical shift computed for the lowest energy conformation of family number 1 (56.26 ppm). However, the total energy of the leading member of family number 3 is 2.3 kcal/mole higher than that of the lowest energy conformation used to compute the value shown in Figure 7A. Conceivably, side-chain interactions (not taken into account in our calculations) could stabilize this high-energy conformation and, hence, provide better agreement with the value observed by Wang and Jardetzky (2002).
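The comparison made here for leucine can be reproduced with a short sketch. The numerical values are the ones quoted in the paragraph above; the function names are ours, and taking the absolute value of the difference is an assumption made so that the result matches the positive deviations quoted in the text.

```python
def deviation(computed_ppm, observed_ppm):
    """Absolute difference (ppm) between a computed and an observed mean shift."""
    return abs(computed_ppm - observed_ppm)


def within_sd(computed_ppm, observed_ppm, observed_sd):
    """True if the computed shift lies within one observed standard deviation."""
    return deviation(computed_ppm, observed_ppm) <= observed_sd


# Leucine 13C(alpha) values quoted in the text (ppm).
OBS_LEU, SD_LEU = 57.54, 0.98
LEU_FAMILY_1 = 56.26  # lowest energy conformation
LEU_FAMILY_3 = 57.97  # leading member of family number 3

family_1_ok = within_sd(LEU_FAMILY_1, OBS_LEU, SD_LEU)
family_3_ok = within_sd(LEU_FAMILY_3, OBS_LEU, SD_LEU)
```

The family 1 value deviates by about 1.3 ppm and falls outside the observed standard deviation, whereas the family 3 value falls inside it.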
The best agreement, in terms of Δ, with respect to the observed mean value for Asp, was obtained with the 13Cα chemical shifts computed at position 2 from the N terminus, by using the charged side chain (Asp−). As already noted at the beginning of this subsection, an analysis of the N-terminal preferences in the α-helical conformation, observed for both peptides (Doig and Baldwin 1995) and proteins (Richardson and Richardson 1988), leads to a good correlation coefficient (R = 0.86). From their analysis, Asn, followed by Asp, are the amino acid residues with the highest propensity for the N-terminal position. This is a reflection of the tendency for their side chains to form a hydrogen-bonded ring with the backbone in a nonhelical backbone conformation (Lewis et al. 1973), and thereby serve as helix breakers.
After considering the N-terminal preferences for Asn, Asp, Ser, and Thr (shown as black-filled bars in Fig. 7A), and the remaining amino acids (shown as open bars in the same figure), 19 out of 20 13Cα chemical shifts are within the observed standard deviation.
Gly
Ananthanarayanan et al. (1971) noted that the low value of the Zimm-Bragg parameter s (Zimm and Bragg 1959) found for glycine, in the temperature range of 0–70°, provides a quantitative basis for classifying this residue as a helix breaker in proteins. Kotelchuck et al. (1969) suggested an N- to C-terminal direction of helix propagation, which implies that Gly possesses a stronger preference for the C-terminal position, in line with the prediction that this residue is a helix breaker (Lewis et al. 1970; Ananthanarayanan et al. 1971). Later, in agreement with this finding, Schellman (1980) noted that about one-third of all α-helices terminate with a Gly at the C terminus. Richardson and Richardson (1988) found that, among all amino acid residues, Gly possesses the sharpest preference for the C terminus rather than for the N terminus in proteins. Rules to explain this preference in proteins were proposed by Aurora et al. (1994). However, Doig and Baldwin (1995) found no preference for Gly at the C terminus in their experiments on α-helical peptides. Nevertheless, all the above evidence indicates that glycine shows a very low propensity for the middle of the α-helix.
In our theoretical calculations, good agreement in terms of Δ (0.8 ppm) was obtained for the computed 13Cα chemical shifts at position 5, that is, in the middle of the α-helix, as shown in Figure 7A. Despite this good agreement, we checked whether the value of Δ could be improved by considering the preference of Gly for the N- or the C-terminal position. Using the 13Cα chemical shifts computed at positions 1 to 9, shown in Figure 2B, we found that the best agreement in terms of Δ, that is, the one that led to Δ = 0.1 ppm in Figure 7A (as a black bar), is found with the value of the 13Cα chemical shifts (47.15 ppm) computed at position 1 (compared to the mean observed value of 47.02 ± 0.90 ppm). By contrast, the computed values of Δ for 13Cα chemical shifts at positions 8 and 9 were 0.4 and 1.0 ppm, respectively. These results show that better agreement with the observed values, in terms of Δ, is found for the values of the 13Cα chemical shifts computed at the N terminus than for those in the middle of the helix.
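The selection of the best-agreeing position for Gly amounts to taking the position with the smallest deviation. The sketch below uses only the four deviations quoted in the paragraph above (positions 1, 5, 8, and 9); the values at the other positions are not given in the text and are therefore left out. The dictionary and variable names are ours.

```python
# Deviations (ppm) between computed and observed 13C(alpha) shifts for Gly,
# keyed by helix position; only the values quoted in the text are included.
gly_deviation_ppm = {1: 0.1, 5: 0.8, 8: 0.4, 9: 1.0}

# Position whose computed shift agrees best with the observed mean value.
best_position = min(gly_deviation_ppm, key=gly_deviation_ppm.get)

# Cross-check of the position 1 value against the quoted shifts (ppm).
position_1_deviation = abs(47.15 - 47.02)
```

Among the quoted positions, the minimum is at position 1, that is, at the N terminus.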
Pro
The computed value plotted in Figure 7A for proline is in good agreement with the observed mean value of Wang and Jardetzky (2002). The lowest energy conformation, that is, with φ = −133.18° and ψ = 78.85°, from which this value was computed, shows that the alanine residue preceding proline lies in the D region of the map of Zimmermann et al. (1977), while proline itself is in the α-region of the Ramachandran map, with a trans peptide bond and the pyrrolidine ring in the up (U) conformation. This particular conformation seen for alanine is in agreement with the observation of MacArthur and Thornton (1991) that, for alanine preceding proline, the entire region −180° < ψ < 60° on the Ramachandran map is completely inaccessible, with φ showing the characteristic preferences of the Ramachandran map.
For the lowest energy conformation identified in the EDMC conformational search, all alanines following proline displayed backbone dihedral angles in the α-region of the Ramachandran map. Hence, in our calculations, proline appears at the N-terminal position of the α-helix, in agreement with the observation of MacArthur and Thornton (1991), who noted that proline in the first turn is almost exclusively in the first position. Because of this observation, we initiated the calculations by placing proline only in the middle of the helix.
End effects
Our results show that modification of the 13Cα chemical shifts to include the correction for the N-terminal preferences for Asn, Asp, Ser, Thr, and Gly residues does not improve the correlation coefficient found in Figure 7A significantly, viz., only from R = 0.97 to 0.98. However, the slope of the correlation line in Figure 8 shows a noteworthy change, viz., from 0.91 to 0.99 (which should be compared with the ideal value of 1.0). Comparison of Figure 7A and Figure 7B shows a greater disagreement for 13Cβ than for 13Cα chemical shifts between computed and mean values observed by Wang and Jardetzky (2002). The reason for such behavior can be found in the analysis of the physical factors that influence the 13Cα and 13Cβ chemical shifts (Wishart and Case 2001; Neal et al. 2003), such as ring current effects, torsional (φ/ψ) effects, dihedral angle (χ) effects, etc.
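The two quantities discussed here, the correlation coefficient R and the slope of the correlation line, can be obtained from paired computed and observed shifts as sketched below. The choice of regressing computed on observed values is an assumption (the text does not state which variable is on which axis), the function name is ours, and the data are synthetic, chosen only so that the expected slope is known in advance.

```python
from math import sqrt


def correlation_and_slope(x, y):
    """Pearson correlation coefficient and least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy), sxy / sxx


# Synthetic example: "computed" shifts lying exactly on a line of slope 0.9.
observed = [45.0, 50.0, 55.0, 60.0, 65.0]
computed = [0.9 * value + 5.0 for value in observed]
r_value, slope = correlation_and_slope(observed, computed)
```

The example illustrates why both numbers are reported: a perfectly linear relation gives R = 1 regardless of the slope, so a slope closer to the ideal value of 1.0 carries information that R alone does not.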
α-helical model peptide. In particular, we carried out a detailed comparison of the computed 13C chemical shifts for the α-helical model peptide with observed mean values and standard deviations obtained from an NMR database. From such a comparison, we conclude that (1) the good agreement seen between computed and observed 13C chemical shifts shows that the lowest energy conformation obtained from a gas-phase potential (represented by ECEPP/3) constitutes a good α-helical representation with which to model the observed mean value for the 13C chemical shift; (2) use of uncharged side chains, on average, gives better results when compared with observed 13C chemical shifts than charged residues do; (3) consideration of the N-terminal preference for some amino acids improves the agreement between computed and observed 13Cα chemical-shift values; and (4) the position dependence of the computed 13Cα chemical shifts, mainly at the N terminus, is a fingerprint that characterizes the 20 naturally occurring amino acids in the α-helical conformation. This effect appears to be a consequence of the asymmetry between the ends and the interior of α-helices, which, in turn, is a consequence of the uneven distribution of the backbone hydrogen bonds. In particular, this site dependence may constitute valuable information that should be considered, among other important contributions, (1) for refinement of NMR-derived 3D models, and (2) for accurate prediction of the secondary structure of proteins when the 13C chemical-shift assignments and the sequence are the only available information.
| Materials and methods |
|---|
An α-helix of nine residues seems to be a good model with which to study the position dependence of the 13C chemical shift because (1) it represents a tradeoff between short and long helices observed in proteins; (2) this length is in close agreement with a previous study of the average length of helices in proteins, which found that the distribution is broad with a mean length of ten residues (Barlow and Thornton 1988); and (3) quantum chemical calculations for a helix of nine residues are feasible in reasonable computational time by using accurate approaches (Vila et al. 2004), as explained later.
It should be noted that, in a terminally blocked α-helix, the initial three NH groups and the final three CO groups differ from the rest by not being able to participate in the complete intramolecular hydrogen bonding that is characteristic of the α-helix, as shown in Figure 1. In this figure, it can also be seen that all the central five residues (numbers 3–7) are bracketed by three backbone hydrogen bonds. They thus differ from the N- and C-terminal residues that are bracketed by fewer hydrogen bonds. Based on these observations, we chose, as an α-helical model, peptides with the sequences Ac-(Ala)i-X-(Ala)j-NH2 (with i + j = 8; 0 ≤ i ≤ 8) and X as any of the 20 naturally occurring amino acids. Cysteine was studied only in the reduced (SH) form.
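The hydrogen-bond bookkeeping described above can be checked with a small sketch. It assumes an ideal helix in which the CO of group i is bonded to the NH of group i + 4, numbers the acetyl cap as group 0 and the NH2 cap as group 10, and interprets "bracketed" as meaning that the residue lies strictly between the CO and NH partners of a hydrogen bond; this interpretation and all function names are ours.

```python
def helix_hbonds(n_residues=9, blocked=True):
    """List (co_group, nh_group) pairs for ideal i -> i + 4 hydrogen bonds.

    Groups are numbered 0..n+1: 0 is the acetyl cap (CO only), 1..n are the
    residues, and n+1 is the NH2 cap (NH only). Caps are absent if unblocked.
    """
    first_co = 0 if blocked else 1
    last_nh = n_residues + 1 if blocked else n_residues
    return [(i, i + 4) for i in range(first_co, n_residues + 1) if i + 4 <= last_nh]


def free_nh_groups(n_residues=9, blocked=True):
    """NH groups that have no CO partner four groups earlier in the chain."""
    bonded = {nh for _, nh in helix_hbonds(n_residues, blocked)}
    last_nh = n_residues + 1 if blocked else n_residues
    return [k for k in range(1, last_nh + 1) if k not in bonded]


def bracketing_counts(n_residues=9, blocked=True):
    """Number of hydrogen bonds spanning each residue 1..n."""
    bonds = helix_hbonds(n_residues, blocked)
    return [sum(1 for co, nh in bonds if co < k < nh) for k in range(1, n_residues + 1)]
```

Under these assumptions the blocked nonapeptide has three free NH groups at the N terminus, the unblocked one has four, and residues 3–7 are each spanned by three hydrogen bonds, in line with the description above.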
Alanine was selected as the dominant component of the model peptide because (1) among all the amino acids, this residue is one of the most frequently occurring ones in an α-helical conformation (Chou and Fasman 1974; Chen et al. 2003), and (2) its methyl side chain will not exhibit much interference with the side chain of the guest (X) residue.
Modeling the α-helical peptide and clustering analysis
The α-helical conformation for the sequence Ac-(Ala)9-NH2 was generated with the Electrostatically Driven Monte Carlo (EDMC) method (Ripoll and Scheraga 1988). An all-atom representation of the chain was used with the ECEPP/3 force field (Momany et al. 1975; Némethy et al. 1983, 1992; Sippl et al. 1984). Starting from a canonical α-helix (φ = −60.0°; ψ = −40.0°), the lowest energy minimum conformation (24.69 kcal/mole) identified among 300 accepted conformations (by the Metropolis criterion) was adopted as the α-helix model conformation for this peptide. In this conformation, all the alanine residues are in the α-helical region of the Ramachandran map (Ramachandran et al. 1963), with φ = −60.0° ± 10° and ψ = −40.0° ± 10°.
When Ala was replaced by any amino acid (X), except for proline (see later section), in the sequence Ac-(Ala)i-X-(Ala)j-NH2 (with i + j = 8; 0 ≤ i ≤ 8), the backbone was kept fixed at the lowest energy minimum identified during the EDMC conformational search for the Ac-(Ala)9-NH2 peptide. Adoption of this conformation ensures that the site-position preference of X is influenced only by the hydrogen-bonded backbone geometry, the charge characteristic of the α-helix, and the nature of X.
Only for leucine, a classification of the accepted conformations generated with the EDMC procedure was carried out with the clustering procedure used by Vila et al. (2002, 2003) to study statistical-coil peptides in solution, that is, through a minimal tree (the Minimal Spanning Tree [MST]) method (Kruskal 1956). The minimal tree was then partitioned in terms of a specified RMSD cutoff, leading to a defined number of families. The families resulting from the RMSD clustering procedure were ranked in increasing order according to their total free energy. For each family, both the number of conformations and the set of dihedral angles of the lowest energy member were stored. We refer to the lowest energy conformation of a family as the leading member.
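The clustering step can be illustrated with a generic sketch: Kruskal's algorithm builds the minimal spanning tree from a pairwise RMSD matrix, tree edges longer than the cutoff are removed, and the remaining connected pieces are the families. This is not the code used in this work; the function names, the toy RMSD matrix, and the toy energies are hypothetical, and the ranking of families by total free energy is not reproduced.

```python
def mst_families(dist, cutoff):
    """Cluster items by cutting minimal-spanning-tree edges longer than cutoff."""
    n = len(dist)
    parent = list(range(n))

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Kruskal: scan edges by increasing length, keep those joining two trees.
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    mst = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            mst.append((d, i, j))

    # Partition the tree: reset the forest and re-join only the short MST edges.
    parent[:] = range(n)
    for d, i, j in mst:
        if d <= cutoff:
            parent[find(i)] = find(j)
    families = {}
    for k in range(n):
        families.setdefault(find(k), []).append(k)
    return sorted(families.values())


def leading_members(families, energies):
    """Return the index of the lowest energy conformation of each family."""
    return [min(family, key=lambda k: energies[k]) for family in families]


# Toy pairwise RMSD matrix (angstroms) and energies (kcal/mole); illustrative only.
toy_rmsd = [
    [0.00, 0.10, 0.60, 0.70],
    [0.10, 0.00, 0.50, 0.60],
    [0.60, 0.50, 0.00, 0.15],
    [0.70, 0.60, 0.15, 0.00],
]
toy_energies = [-3.0, -5.0, -1.0, -0.5]
toy_families = mst_families(toy_rmsd, cutoff=0.2)
toy_leaders = leading_members(toy_families, toy_energies)
```

Cutting the tree at a cutoff is equivalent to single-linkage clustering at that distance, which is why only the tree edges need to be inspected.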
Modeling the side-chain orientation
Polar or ionizable groups have a strong tendency to be exposed to the solvent, and hence, great mobility of their side chains is expected to occur. The side-chain position of nonpolar groups in an α-helix will be dominated by side-chain–side-chain interactions, mainly in the interior of a protein. In modeling the α-helical conformation, no attention is paid to side-chain–side-chain interactions or to side-chain mobility. It is not feasible to model both of these effects by quantum chemical calculations with the available computational resources. Therefore, the following procedure was used: For each guest residue X in the peptide Ac-(Ala)4-X-(Ala)4-NH2 (with X being any naturally occurring amino acid other than glycine and proline), we carried out an EDMC conformational search in which the only allowed variables were the dihedral angles χ of residue X. During this conformational search, all the backbone dihedral angles (φ, ψ) were fixed at the values obtained for the lowest energy conformation of the α-helix, obtained with the sequence Ac-(Ala)9-NH2, as described in the previous section. The dihedral angles χ, from the lowest energy conformation identified in this restricted EDMC conformational search for each guest residue X, were adopted as the representative side-chain orientation for each respective amino acid, independent of the backbone position that it occupied within the helical structure. The 13C chemical shifts computed in this manner, for each of the 20 naturally occurring amino acids in the middle of the α-helix, were compared with the observed mean values of Wang and Jardetzky (2002), as shown in Figure 6, A and B.
In contrast to the adoption of the low-energy backbone conformation of Ac-(Ala)9-NH2 for 19 residues in the middle position of the α-helix, a different approach was taken for proline. For this residue, only in position X5 in the sequence Ac-(Ala)4-X5-(Ala)4-NH2, an EDMC conformational search was carried out taking into account (1) a canonical α-helical structure (φ = −60.0°; ψ = −40.0°) as the starting conformation, (2) all backbone and side-chain dihedral angles of the whole chain considered as variables during the minimization procedure, (3) cis–trans isomerization of the peptide group for the proline residue, and (4) both up (U) and down (D) puckering conformations of the pyrrolidine ring, which pertain to the (φ = −53.0° and χ1 = −28.1°) and (φ = −68.8° and χ1 = 27.4°) positions, respectively, of the Cγ atom of this residue. The 13C chemical shift was computed only for the lowest energy conformation identified in this conformational search for proline in the middle of the helix, that is, for the peptide Ac-(Ala)4-Pro-(Ala)4-NH2.
To understand why there was disagreement between the computed and observed 13C chemical shifts for leucine (as shown in the Results and Discussion section), a more detailed analysis of the side-chain conformational preferences was carried out. The 100 accepted side-chain conformations obtained with the EDMC conformational search, in which the dihedral angles χ of the leucine side chain were the only variables, were clustered at an RMSD of 0.2 Å, without a cutoff in energy. The leading member of each of the first five (out of six) families was used to compute the range of variation of the 13C chemical shifts as a function of the side-chain orientation. The energy of the sixth family was too high to be of significance.
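The clustering step described above can be illustrated with a minimal Python sketch. It is not the authors' clustering procedure: it uses a simple greedy, energy-ordered "leader" scheme as a stand-in, and it assumes that all conformations share the same fixed backbone frame, so that no superposition is needed before computing the RMSD. The function names `rmsd` and `cluster_by_rmsd`, and the `(energy, coordinates)` input layout, are hypothetical.

```python
import math

def rmsd(a, b):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes both conformations are already in the same reference frame
    (e.g., a fixed backbone), so no superposition is performed.
    """
    if len(a) != len(b):
        raise ValueError("conformations must have the same number of atoms")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))

def cluster_by_rmsd(conformations, cutoff=0.2):
    """Group (energy, coords) pairs into families.

    Conformations are visited in order of increasing energy; each one joins
    the first family whose leading (lowest-energy) member lies within
    `cutoff` (in angstroms), otherwise it starts a new family.
    No energy cutoff is applied.
    """
    families = []
    for conf in sorted(conformations, key=lambda c: c[0]):
        for family in families:
            if rmsd(conf[1], family[0][1]) <= cutoff:
                family.append(conf)
                break
        else:
            families.append([conf])
    return families
```

With this layout, `families[k][0]` is the leading member of family `k`, which is the conformation that would be passed on to the chemical-shift calculation.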
Modeling the position-dependence of the 13C chemical shifts
The position dependence of the 13C chemical shifts for a given guest residue X, other than proline, was computed by starting with X at the N terminus of the α-helix and shifting it by one residue at a time up to the C terminus; that is, we computed the 13C chemical shift for the guest residue X in the sequence Ac-(Ala)i-X-(Ala)j-NH2 (with i + j = 8) for 0 ≤ i ≤ 8. The backbone dihedral angles (φ, ψ) were always fixed at the values found for the α-helix model conformation of poly(L-alanine) defined earlier, and the dihedral angles (χ) were those found for the lowest-energy conformation of the guest residue X in the middle of the helix, as explained in the previous section. Following this procedure, that is, fixing the dihedral angles φ, ψ, and χ, we determined the changes that occur in the 13C chemical shift only as a consequence of the change in position along the α-helix.
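As an illustration of the position scan, the following Python sketch enumerates the nine host-guest sequences Ac-(Ala)i-X-(Ala)j-NH2 with i + j = 8. The function names and the list-of-residue-names representation are hypothetical; the sketch covers only the bookkeeping of sequences, not the geometry or the quantum-chemical calculation.

```python
def host_guest_sequences(guest, total_ala=8):
    """Return (i, j, residues) for every Ac-(Ala)i-X-(Ala)j-NH2 with i + j = total_ala.

    i = 0 places the guest at the N terminus; i = total_ala places it
    at the C terminus.
    """
    sequences = []
    for i in range(total_ala + 1):
        j = total_ala - i
        residues = ["Ala"] * i + [guest] + ["Ala"] * j
        sequences.append((i, j, residues))
    return sequences

def format_peptide(residues):
    """Write a residue list as a blocked peptide string, e.g. 'Ac-Ala-...-NH2'."""
    return "-".join(["Ac"] + list(residues) + ["NH2"])

if __name__ == "__main__":
    for i, j, residues in host_guest_sequences("Leu"):
        print(i, j, format_peptide(residues))
```

Each entry keeps the guest index `i` explicit, so the same fixed (φ, ψ) and χ values can be applied at every position and only the location of X changes.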
The proline residue was studied only in the middle of the helix because (1) there is experimental evidence showing that proline can be classified as a helix breaker (Lewis et al. 1970; Altmann et al. 1990) and (2) data from the PDB indicate that proline residues exhibit a preference to be either in the first turn of an α-helix or after the C terminus (MacArthur and Thornton 1991).
Quantum-chemical calculations of the 13C chemical shift
It is possible to obtain theoretical shielding values of good quality by using large basis sets located only on the atoms whose shifts are of interest, while the rest of the atoms in the molecule are treated with more modest basis sets (Chesnut and Moore 1989). This is called the locally dense basis approach, and its use enables us to minimize the length of the chemical-shift calculations while maintaining the accuracy of the results (Laws et al. 1995; Vila et al. 2003, 2004). Using this approach, we were able to show (Vila et al. 2004) that the fairly small improvement in the quality of the 13C chemical shift obtained by treating sets of three, five, or seven consecutive residues, instead of one, with a locally dense basis set did not justify the large loss (by more than a factor of 17) in computational speed. Based on this observation, in this work we computed the 13C chemical shifts by treating a single residue, that is, the guest residue X, with a locally dense 6-311+G(2d,p) basis set, while the rest of the molecule was treated with the simpler 3-21G basis set. This notation refers to the basis sets of Pople and colleagues (Hehre et al. 1986) as implemented in Gaussian 98 (Frisch et al. 1998). All the calculated isotropic shielding values (σ) were referenced with respect to a tetramethylsilane (TMS) 13C chemical shift scale (δ), as described previously (Vila et al. 2002).
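The two bookkeeping steps in this section, assigning the locally dense basis set to the guest residue and converting a computed shielding into a chemical shift, can be sketched in Python as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the conversion uses the common convention δ = σ(TMS) − σ, with the TMS reference shielding left as an input because its value is not given in this section.

```python
LARGE_BASIS = "6-311+G(2d,p)"  # locally dense basis set on the residue of interest
SMALL_BASIS = "3-21G"          # more modest basis set on the rest of the molecule

def assign_basis(residues, guest_index):
    """Return one basis-set label per residue: large on the guest, small elsewhere."""
    if not 0 <= guest_index < len(residues):
        raise IndexError("guest_index is outside the sequence")
    return [LARGE_BASIS if k == guest_index else SMALL_BASIS
            for k in range(len(residues))]

def shielding_to_shift(sigma, sigma_tms):
    """Convert an isotropic shielding (ppm) to a chemical shift (ppm).

    Assumes the common convention delta = sigma_TMS - sigma; sigma_tms must be
    a TMS shielding computed consistently with sigma (an input, not a value
    taken from the text).
    """
    return sigma_tms - sigma
```

For example, `assign_basis(["Ala", "Leu", "Ala"], 1)` labels only the central residue with the large basis set, mirroring the single-residue treatment adopted in this work.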
| References |
|---|
Ananthanarayanan, V.S., Andreatta, R.H., Poland, D., and Scheraga, H.A. 1971. Helix-coil stability constants for the naturally occurring amino acids in water. III. Glycine parameters from random poly(hydroxybutylglutamine-co-glycine). Macromolecules 4: 417–424.
Aurora, R., Srinivasan, R., and Rose, G. 1994. Rules for α-helix termination by glycine. Science 264: 1126–1130.
Barlow, D.J. and Thornton, J.M. 1988. Helix geometry in proteins. J. Mol. Biol. 201: 601–619.
Bruch, M.D., Dhingra, M.M., and Gierasch, L.M. 1991. Side chain-backbone hydrogen bonding contributes to helix stability in peptides derived from an α-helical region of carboxypeptidase A. Proteins 10: 130–139.
Chakrabartty, A., Doig, A.J., and Baldwin, R.L. 1993. Helix capping propensities in peptides parallel those in proteins. Proc. Natl. Acad. Sci. 90: 11332–11336.
Chen, H., Zhou, X., and Ou-Yang, Z.-C. 2003. Classification of amino acids based on statistical results of known structures and cooperativity of protein folding. Phys. Rev. E 65: 061907.
Chesnut, D.B. and Moore, K.D. 1989. Locally dense basis-sets for chemical-shift calculations. J. Comp. Chem. 10: 648–659.
Chou, P.Y. and Fasman, G.D. 1974. Conformational parameters for amino-acids in helical, β-sheet, and random coil regions calculated from proteins. Biochemistry 13: 211–222.
Doig, A.J. and Baldwin, R.L. 1995. N- and C-capping preferences for all 20 amino acids in α-helical peptides. Protein Sci. 4: 1325–1336.
Fairman, R., Shoemaker, K.R., Stewart, J.M., and Baldwin, R.L. 1989. Further studies of the helix dipole model: Effects of a free α-NH3+ or α-COO− group on helix stability. Proteins 5: 1–7.
Frisch, M.J., Trucks, G.W., Schlegel, H.B., Scuseria, G.E., Robb, M.A., Cheeseman, J.R., Zakrzewski, V.G., Montgomery Jr., J.A., Stratmann, R.E., Burant, J.C., et al. 1998. Gaussian 98, Revision A.7. Gaussian Inc., Pittsburgh, PA.
Hehre, W.J., Radom, L., Schleyer, P., and Pople, J.A. 1986. Ab initio molecular orbital theory. John Wiley and Sons, New York.
Iwadate, M., Asakura, T., and Williamson, M.P. 1999. Cα and Cβ carbon-13 chemical shifts in proteins from an empirical database. J. Biomol. NMR 13: 199–211.
Kotelchuck, D., Dygert, M., and Scheraga, H.A. 1969. The influence of short-range interactions on protein conformation. III. Dipeptide distributions in proteins of known sequence and structure. Proc. Natl. Acad. Sci. 63: 615–622.
Kruskal Jr., J.B. 1956. On the shortest spanning subtree of a graph and the traveling salesman problem. Proc. Am. Math. Soc. 7: 48–50.
Kuszewski, J., Qin, J.A., Gronenborn, A.M., and Clore, G.M. 1995. The impact of direct refinement against C-13(α) and C-13(β) chemical-shifts on protein-structure determination by NMR. J. Magn. Reson. Ser. B 106: 92–96.
Laws, D.D., Le, H., de Dios, A.C., Havlin, R.H., and Oldfield, E. 1995. A basis size dependence study of Carbon-13 nuclear magnetic resonance spectroscopic shielding in Alanyl and Valyl fragments: Toward protein shielding hypersurfaces. J. Am. Chem. Soc. 117: 9542–9546.
Lecomte, J.T.J. and Moore, C.D. 1991. Helix formation in apocytochrome b5: The role of a neutral histidine at the N-cap position. J. Am. Chem. Soc. 113: 9663–9665.
Lewis, P.N., Gō, N., Gō, M., Kotelchuck, D., and Scheraga, H.A. 1970. Helix probability profiles of denatured proteins and their correlation with native structures. Proc. Natl. Acad. Sci. 65: 810–815.
Lewis, P.N., Momany, F.A., and Scheraga, H.A. 1973. Energy parameters in polypeptides. VI. Conformational energy analysis of the N-acetyl N'-methyl amides of the twenty naturally occurring amino acids. Israel J. Chem. 11: 121–152.
Lyu, P.C., Zhou, H.X., Jelveh, N., Wemmer, D.E., and Kallenbach, N.R. 1992. Position-dependent stabilizing effects in α-helices-N-terminal capping in synthetic model peptides. J. Am. Chem. Soc. 114: 6560–6562.
MacArthur, M.W. and Thornton, J.M. 1991. Influence of proline residues on protein conformation. J. Mol. Biol. 218: 397–412.
Momany, F.A., McGuire, R.F., Burgess, A.W., and Scheraga, H.A. 1975. Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. J. Phys. Chem. 79: 2361–2381.
Navon, A., Ittah, V., Landsman, P., Scheraga, H.A., and Haas, E. 2001. Distributions of intramolecular distances in the reduced and denatured states of bovine pancreatic ribonuclease A. Folding initiation structures in the C-terminal portions of the reduced protein. Biochemistry 40: 105–118.
Neal, S., Nip, A.M., Zhang, H., and Wishart, D.S. 2003. Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts. J. Biomol. NMR 26: 215–240.
Némethy, G., Pottle, M.S., and Scheraga, H.A. 1983. Energy parameters in polypeptides. 9. Updating of geometrical parameters, nonbonded interactions, and hydrogen bond interactions for the naturally occurring amino acids. J. Phys. Chem. 87: 1883–1887.
Némethy, G., Gibson, K.D., Palmer, K.A., Yoon, C.N., Paterlini, G., Zagari, A., Rumsey, S., and Scheraga, H.A. 1992. Energy parameters in polypeptides. 10. Improved geometrical parameters and nonbonded interactions for use in the ECEPP/3 algorithm, with application to proline-containing peptides. J. Phys. Chem. 96: 6472–6484.
Nicholson, H., Becktel, W.J., and Matthews, B.W. 1988. Enhanced protein thermo-stability from designed mutations that interact with α-helix dipoles. Nature 336: 651–656.
Oldfield, E. 2002. Chemical shifts in amino acids, peptides, and proteins: From quantum chemistry to drug design. Annu. Rev. Phys. Chem. 53: 349–378.
Penel, S., Morrison, R.G., Mortishire-Smith, R.J., and Doig, A.J. 1999. Periodicity in α-helix lengths and C-capping preferences. J. Mol. Biol. 293: 1211–1219.
Poland, D. and Scheraga, H.A. 1970. Theory of helix-coil transitions in biopolymers. Academic Press, New York.
Presta, L.G. and Rose, G.D. 1988. Helix signal in proteins. Science 240: 1632–1641.
Ramachandran, G.N., Ramakrishnan, C., and Sasisekharan, V. 1963. Stereochemistry of polypeptide chain configurations. J. Mol. Biol. 7: 95–99.