1 NEC Laboratories America, Princeton, New Jersey 08540, USA
2 Institut de Physique Théorique, Université de Lausanne, 1015 Lausanne, Switzerland
3 Center for Studies in Physics and Biology, Rockefeller University, New York, New York 10021, USA
Reprint requests to: Chao Tang, NEC Laboratories America, Princeton, NJ 08540, USA; e-mail: tang{at}nec-labs.com; fax: (609) 951-2483.
(RECEIVED September 10, 2003; FINAL REVISION November 26, 2003; ACCEPTED November 28, 2003)
| Abstract |
|---|
Keywords: hydrophobicity; protein folding; surface exposure; secondary structure; designability
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.03431704.
| Introduction |
|---|
Of the many forces involved, it is argued that the hydrophobic interaction plays a central role in determining the overall fold of a protein (Kauzmann 1959; Tanford 1978). Each of the 20 amino acids has a characteristic hydrophobicity, a measure of the nonpolarity (insolubility in water) of a molecule. On average, hydrophobic residues tend to be in the core of a protein, where solvent accessibility is low, whereas polar residues tend to reside on the surface, where solvent accessibility is high (Rose et al. 1985; Miller et al. 1987; Lesser and Rose 1990; Lins et al. 2003). Many attempts based on different approaches have been made to determine the hydrophobicity of the amino acids (Nozaki and Tanford 1971; Kyte and Doolittle 1982; Engelman et al. 1986; Nauchitel and Somorjai 1994; Miyazawa and Jernigan 1996, 1999; DeVido et al. 1998; Branden and Tooze 1999). However, the various scales in the literature sometimes disagree as to these hydrophobicity rankings (Nauchitel and Somorjai 1994), which has been attributed to the fact that hydrophobicity is a relative quantity that depends on the environment and reference molecules used in the measurement (DeVido et al. 1998). Empirical hydrophobicity measurements may not truly reflect the energetics of solvation in protein folding (Lee 1993). Statistical scales may better reflect the role of solvation in folding.
Although on average there is a correlation between hydrophobicity and surface exposure (Chothia 1974; Rose et al. 1985; Miller et al. 1987), the extent to which a fold of a protein, and hence its specific surface-exposure pattern, correlates with the hydrophobic pattern dictated by its amino acid sequence remains unclear. If the average hydrophobic behavior of amino acids is generally true, one might expect that there should be a statistically significant correlation between the hydrophobicity sequence and the corresponding surface-exposure pattern. However, theoretical studies of protein folding using only hydrophobicity models (Dill 1985; Lau and Dill 1989) have shown that there can be significant variations among hydrophobic–polar sequences that adopt a given structure (Li et al. 1998). This translates into the theoretical structures having a large degree of mutational stability (Li et al. 1996). Do real proteins also display this behavior? Quantifying the degree of variation between sequence and structure will be relevant to protein design based purely on hydrophobic–polar (HP) patterning, in which the hydrophobicity sequence is assumed to dictate the final fold (Kamtekar et al. 1993).
In this article, we analyze on a structure-to-structure basis the correlation between hydrophobicity sequence and surface-exposure pattern for several commonly used hydrophobicity scales. We find that all the scales yield similar distributions of correlation coefficients, and that these distributions are statistically significant when compared with a null model in which the amino acid sequences are randomized. However, the distributions are broad, and the means are far from the fully correlated limit. We explore various factors that influence this less-than-optimal correlation between sequence and surface-exposure pattern. These include the degree of mutational stability (i.e., sequence entropy/designability), along with lesser effects such as the actual surface-exposure propensities of the amino acids and secondary-structural influences. We show that the less-than-optimal correlation between sequence and structure for naturally occurring proteins is a manifestation of designability, and may also be selected for to "design out" competing folds.
| Results |
|---|
α-helical amino acid side chains from water to a nonaqueous environment (Engelman et al. 1986), determination of transfer free energies by measuring solubilities in water and ethanol relative to the reference amino acid glycine (Nozaki and Tanford 1971), calculating residue–residue potentials with pairwise contact energies (Miyazawa and Jernigan 1996), and a refined study of the latter using the Bethe approximation for determination of relative contact energies with respect to the native state (Miyazawa and Jernigan 1999). These scales cover a broad range of methods used to characterize hydrophobicity, ranging from empirical to statistical approaches.
Figure 1 shows the distributions of computed correlations between the hydrophobicity sequences and surface-exposure patterns of the 3242 structures in our data set using the above scales. The black histograms were computed using all the amino acids. None of the means exceed 0.5, with the highest being $\mu_{\text{data}} = \langle c^{S} \rangle_{\text{database}} = 0.454$ for the scale in Miyazawa and Jernigan 1999. Nevertheless, the computed distributions are significantly different from the null model, which considers the same set of structures but uses randomized versions of their amino acid sequences. (For each representative structure, we computed the correlation coefficient between its surface-exposure pattern and 25 random versions of its hydrophobicity sequence.) The distribution of correlation coefficients computed for the null model is shown in blue for each scale. Despite several discrepancies in classification between the scales, it can be seen that all yield similar distributions of correlation coefficients and that all have similar scores $Z = (\mu_{\text{data}} - \mu_{\text{null}})/\sigma_{\text{null}}$ when compared with their null models, with values between $Z = 2.46$ and $Z = 2.91$ (see Table 1).
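The null-model comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as actually implemented: the function names (`pearson`, `null_correlations`, `z_score`) are hypothetical, the correlation is assumed to be of the Pearson form, and the use of the population standard deviation for the null spread is a choice made here. Only the 25 shuffles per structure and the form of the Z score are taken from the text.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # convention chosen here for constant profiles
    return cov / sqrt(vx * vy)


def null_correlations(hydro, exposure, n_shuffles=25, rng=None):
    """Correlate the exposure pattern with shuffled copies of the sequence.

    Shuffling keeps the amino acid composition of the sequence but
    destroys its positional order, which is what the null model tests.
    """
    rng = rng or random.Random(0)
    results = []
    for _ in range(n_shuffles):
        shuffled = list(hydro)
        rng.shuffle(shuffled)
        results.append(pearson(shuffled, exposure))
    return results


def z_score(data_corrs, null_corrs):
    """Z = (mean of data - mean of null) / standard deviation of null."""
    mu_data = sum(data_corrs) / len(data_corrs)
    mu_null = sum(null_corrs) / len(null_corrs)
    var_null = sum((c - mu_null) ** 2 for c in null_corrs) / len(null_corrs)
    return (mu_data - mu_null) / sqrt(var_null)
```

In use, `pearson` would be applied once per structure to obtain the data distribution, `null_correlations` would be pooled over all structures to obtain the null distribution, and `z_score` would compare the two.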
Lastly, we consider the improvements to the correlation between sequence and structure if only residues that form secondary-structural elements are used. Many helices and strands have one side that is hydrophobic and hence tends to be in the core, whereas the other side is polar and tends to be exposed on the surface. Turns tend to be flexible and irregular. Including turns may increase the noise in the data. Figure 2C shows that a slight improvement is gained by only considering helices and strands. We further break down the connection to secondary-structural elements and surface exposure for the various amino acids below.
Surface-exposure distributions of the amino acids
As shown above, the known hydrophobicity scales yield statistically significant correlations between a protein's pattern of surface exposure and the hydrophobicities of its amino acid sequence. However, despite this statistical significance, the correlations are far from the case in which hydrophobicity and exposure patterns are completely correlated. In this section, we show that this departure from optimal correlation can be partly attributed to the broad distribution of surface exposures that some amino acids tolerate. In the spirit of the work by Rose et al. (1985), for each amino acid we have computed its surface-exposure distribution within the representative set of structures. From the distributions we derive a surface-exposure propensity that reflects the tendency of each amino acid to be either exposed or buried in the core, and show that this scale leads to a better correlation between sequence and surface pattern.
Before considering the surface-exposure distributions of each amino acid, we examine the probability distributions for surface exposure and amino acid occurrence within the database of structures. Folded proteins are dense, three-dimensional (3D) clusters of amino acids. The core thus represents a considerable portion of the whole protein, whereas only a relatively small number of amino acids are to some extent exposed to the aqueous solvent. In Figure 3A we show the probability p(A) for a given surface exposure A using all of the side-chain exposures from the 3242 representative structures. It is clear that a large fraction of residues reside in the core, where surface exposure is low. The probability of occurrence for the individual amino acids, p(a.a.), is also nonuniform. Figure 3B shows the occurrence frequencies of the amino acids within the sequences used in the data set. These distributions will be used to examine whether the occurrence of an amino acid with a given surface exposure is correlated or independent.
![]() | (1) |
where values >1 indicate that the amino acid is favored at the given surface exposure, whereas values <1 indicate that it is less favored.
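A propensity of this kind can be sketched as the ratio of the observed joint frequency to the value expected if amino acid identity and surface exposure were independent, i.e. p(a.a., A) / [p(a.a.) p(A)]. That functional form is an assumption made here for illustration; it is chosen only because it equals 1 under independence and exceeds 1 for favored combinations, as the text describes. The function name `exposure_propensity`, the number of exposure bins, and the input layout are likewise hypothetical.

```python
from collections import Counter


def exposure_propensity(observations, n_bins=10):
    """Ratio of joint frequency to the product of marginal frequencies.

    observations: iterable of (amino_acid, fractional_exposure) pairs,
    with fractional_exposure expected in [0, 1].
    Returns {(amino_acid, bin_index): propensity} for observed pairs.
    """
    binned = []
    for aa, exposure in observations:
        # Clamp so that an exposure of exactly 1.0 falls in the last bin.
        bin_index = min(max(int(exposure * n_bins), 0), n_bins - 1)
        binned.append((aa, bin_index))

    total = len(binned)
    joint = Counter(binned)
    aa_counts = Counter(aa for aa, _ in binned)
    bin_counts = Counter(b for _, b in binned)

    return {
        (aa, b): (n / total)
        / ((aa_counts[aa] / total) * (bin_counts[b] / total))
        for (aa, b), n in joint.items()
    }


# Toy data: leucine mostly buried, lysine mostly exposed.
toy = [("L", 0.05)] * 3 + [("L", 0.95)] + [("K", 0.05)] + [("K", 0.95)] * 3
propensity = exposure_propensity(toy)
```

With these toy counts, buried leucine has a propensity above 1 and exposed leucine a propensity below 1, which is the qualitative pattern expected of a core amino acid.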
Figures 4–6 show the distributions of P for the 20 amino acids. The distributions are rather broad. Tests using only a half of the database, and others using only a half of the length of the sequences, led to very similar results. As was found by Rose et al. (1985), our distributions are also suggestive of three classes of amino acids: core amino acids (C) with a peak at low surface exposure, surface amino acids (S) with a peak at high surface exposure, and intermediate amino acids (M) with relatively flat distributions. We are in agreement with Rose et al. regarding core amino acids; however, there are discrepancies between our classification of intermediate and surface amino acids. Nominally some of our intermediate amino acids show preferences for being on the surface when only secondary structure is considered; this is discussed below.
Secondary-structure analysis
The native configuration of a folded protein is characterized by secondary-structure elements, α-helices and β-strands, which are connected by turns (Levitt and Chothia 1976). It was shown above that considering only the sequence and surface patterns of secondary structural elements led to a slight improvement in the correlation between hydrophobicity and exposure. In this section, we break down the occurrence of the 20 amino acids in these structural elements and their corresponding surface-exposure patterns. We first consider the distribution of surface exposures within secondary elements irrespective of amino acid: Figure 3A shows that most of the residues in α-helices and β-strands occur in the interior of native protein configurations. However, this effect is much stronger for β-strands, indicating that residues making up β-strands have a higher tendency to be in the core than those making up helices.
It is well known that the various amino acids have different propensities to form either α-helices or β-strands (Munoz and Serrano 1994). Figure 3B shows the frequency of occurrence of each amino acid in α-helices and β-strands compared with the frequency of occurrence over the whole database. The amino acids are arranged according to their ASA values in increasing order. Compared with the total database, β-strands tend to be composed of a high portion of amino acids with low ASA and rather large side chains, such as V, I, and T, or with an aromatic ring as in F, Y, and W, whereas charged amino acids occur less frequently than expected. For α-helices, strong helix-formers such as alanine are particularly prominent, and the residues that are found more frequently in other parts of the proteins are divided into comparable numbers of amino acids with low and high ASA.
Figures 4–6 show the surface-exposure distributions P of the 20 amino acids in α-helices and in β-strands, juxtaposed with the distributions for the entire database. For the core (C) amino acids, the differences are rather small. However, for the intermediate (M) amino acids, both arginine and asparagine (which are nominally polar) appear prominently as being exposed in β-strands. Arginine is also seen to have a tendency to appear on the exposed surfaces of helices. For those nominally polar amino acids (S) classified as residing on the surface, the propensity to be exposed is further increased within secondary structures when compared with the results obtained from the whole database. These slight enhancements in surface-exposure propensity for certain amino acids in secondary-structural elements lead to the marginal improvement in correlation between sequence and surface exposure seen above when only secondary elements were included.
Model
Theoretically, hydrophobic–polar (HP) models have been studied for some time to help clarify the nature of the hydrophobic force in the folding process. Correlations have been studied in the context of sequence (White and Jacobs 1990), and nonrandomness has been detected both in real protein sequences and theoretical models (Irbäck et al. 1996; Irbäck and Sandelin 2000). Here, we consider the correlations between hydrophobicity sequences and surface-exposure patterns that emerge in a protein-folding model based solely on hydrophobicity. Does the less than perfect correlation between hydrophobicity sequence and surface pattern still remain when only solvation energy is considered? If so, is it caused by the large variation of sequences that can be tolerated by highly designable structures (Li et al. 1996)? How does averaging improve the correlation in the model results?
We study the folding of random amino acid sequences using an HP model (see Materials and Methods), in which the single energy entering the analysis is a solvation energy dependent only on the hydrophobicities of the side chains and their corresponding surface exposures in a fold. Because it is not computationally feasible to consider the continuum of possible structures that a large set of random sequences could adopt, we choose to use only a finite number of compact representative folds, formed in this case by a statistically complete set of four-helix bundles. The designability of this set of structures has been studied previously, and many of the top designable helix structures in this set correspond to naturally occurring four-helix bundles (Emberly et al. 2002). The set has the following advantages: (1) The folds are 60-mers and hence are much longer than structures generated by enumerating all possible structures using a finite set of dihedral angles (Miller et al. 2002); (2) it is more diverse than decoy sets generated from a specific native fold. A set of random amino acid sequences was folded onto the above set of structures using the HP model (see Materials and Methods). We chose the top 250 designable structures and their corresponding sequences to form the database on which to perform the correlation analysis. These structures represent plausibly thermodynamically stable folds and their corresponding sequences, although just a mere sample of the sequences that actually fold into these structures are assumed to be good folders. Lattice studies have shown that removing the compactness constraint can lead to a different set of designable structures (Chan and Bornberg-Bauer 2002), but the correlation findings below undoubtedly would not change.
Figure 8 shows the distribution of correlation coefficients between the hydrophobicity sequences and surface-exposure patterns of the model. The green histogram was computed using only a single sequence for each structure, randomly selected from the pool of sequences that fold to that structure. This is nearly identical to what was found from the database, namely, that the correlation between a hydrophobicity sequence and its structure is less than optimal. The red histogram is for a randomized version of the data. Thus, as before, the correlation between sequence and structure is not random and has some statistical significance. Because for each of the 250 designable structures we have several hundred sequences that fold into them, we can assess the effects of sampling. As in the analysis for the real protein structures, the mean hydrophobicity sequence was computed for each set of sequences that adopt the same fold. Although the mean is somewhat greater than those of the database distributions, the model distribution remains similar to the results computed from the database structures and sequences. Reducing the number of sequences used to compute the average (10%) still leads to an improvement in the correlation and is more in line with the improvement seen in the database analysis. We discuss the implications of the theoretical findings in light of the database results below.
| Discussion |
|---|
The origin of this suboptimal correlation may lie in the fact that there are factors other than hydrophobicity that contribute to the determination of a protein's final fold. There are clearly other forces at work in determining a protein's ultimate fold; for example, a recent study suggested that hydrophobicity alone cannot account for the observed thermodynamics of protein folding (Chan 2000). Thus, the behavior of some residues may not be solely dictated by hydrophobicity. Using updated data, we carried out an analysis similar to Rose et al. (1985) to determine the surface-exposure distributions of each of the amino acids, and found that many were rather broad. Indeed, several amino acids have essentially flat distributions, and hence their exposure seems to be uncorrelated with their hydrophobicity. Such broad distributions are in part responsible for the less than optimal correlation, and we showed that using only a subset of amino acids that have more peaked distributions led to improved correlations. The exposure distributions reflect all of the forces that are involved in the folding process, and we have found several discrepancies between the most likely exposure of an individual amino acid and its hydrophobicity. An example is provided by cysteine, for which the ability to form disulfide bonds with other cysteine residues constitutes a factor independent of hydrophobicity that influences surface exposure. From the distributions we computed a scale that reflects the surface-exposure propensities of the amino acids. This goes beyond just hydrophobicity and leads to an improvement in the correlation between sequence and the surface-exposure pattern of a fold. Hence, for folding studies that use energy models that are based solely on side-chain solvation, using these database-derived distributions (or the ASAs computed from them) over the empirical hydrophobicity scales should lead to a better performance.
By far the greatest improvement was achieved when we computed the correlation coefficients between average hydrophobicity sequences and structures. The average hydrophobicity sequence gives a better measure of the sequence that best matches the structure (Finkelstein 1998). The low correlation observed at the single-sequence level shows that there can be a broad variation from that of the "best match" sequence. From theoretical models, it is predicted that thermodynamically stable folds are those that are also highly designable; that is, they have a large number of sequences that fold into them (Li et al. 1996; Emberly et al. 2002; Miller et al. 2002). This large degree of mutational stability for designable folds means that there can be significant departures from the lowest energy sequence. In fact, if sequences were selected at random from a large pool of sequences that fold into a designable structure, it would be more likely to select a sequence far from the central "best match" sequence than not. Even if a sequence started near the "center" (best match sequence), its "neutral" evolution would lead it to somewhere farther away from the center in the sequence space owing to the sequence entropy (Li et al. 1998; Taverna and Goldstein 2002a). Hence, the lack of strong correlation between sequence and structure found in the database could be a signature of designability in nature. It has also been postulated that it may even be advantageous for sequences to select against being near the "best match," as such selection helps to improve plasticity in sequence space (Taverna and Goldstein 2002b).
We have shown that the correlation improves when one uses the average hydrophobicity sequence; however, we have also found that even the average sequence is not perfectly correlated with the surface-exposure pattern. This could simply be because of insufficient sampling of sequence space or could be evidence of something more fundamental. It has been argued that having a suboptimal correlation between a protein's amino acid sequence and surface-exposure pattern may help to improve the thermodynamic stability of the fold and "design out" competing folds (DeGrado 1997). All of the average hydrophobicity patterns for the most designable model helix structures have "misspellings" at various locations, where a misspelling involves the placement of a hydrophobic residue on an exposed site or a polar residue in the core. These departures from the optimal pairing of hydrophobicity with exposure have been shown in other theoretical studies (Emberly et al. 2002) to help increase the energy gap between the ground state and competing structures. If the hypothesis of designing out competing structures through suboptimal correlation is valid, this has important consequences for structural design based on binary patterning (Kamtekar et al. 1993). The surface pattern of the structure may act as a starting point for the selection of an amino acid sequence, but it may then prove advantageous to depart from this blueprint to improve thermodynamic stability. Database analysis of the type performed here may form the basis for advanced techniques to detect further correlations between sequence and structure that would help to better design sequences in protein design.
| Materials and methods |
|---|
Correlation analysis
A hydrophobicity scale $s$ assigns a hydrophobicity value $h^{s}_{\mathrm{a.a.}}$ to each amino acid (a.a.). $h^{s}_{i,j}$ is the hydrophobicity of the $i$-th aligned residue of sequence $j$ that is aligned with a representative structure, based on the hydrophobicity scale $s$. For the set of amino acid sequences that fold into a given structure, we wish to consider what the average hydrophobicity sequence for the set is. We consider the average sequence because it gives a good characterization of the hydrophobicity sequence that adopts the given representative structure (Finkelstein 1998; Cui and Wong 2000). The average hydrophobicity value $\bar{h}^{s}_{i}$ at position $i$ within this representative structure using scale $s$ is:
$$\bar{h}^{s}_{i} = \frac{1}{M} \sum_{j=1}^{M} h^{s}_{i,j} \qquad (2)$$
where $M$ is the number of sequences in the alignment at residue $i$. Calculating this average for all residues of the representative structure with length $N$ gives the average hydrophobicity sequence of this structure: $\{\bar{h}^{s}_{1}, \ldots, \bar{h}^{s}_{N}\}$.
The surface exposure $a_i$ of residue $i$ in a structure is quantified as the amount of surface area of the side chain atoms (represented as spheres) that is accessible to water (represented by a sphere of radius 1.4 Å). For each structure, we obtain the surface exposures of each of its residues from the FSSP file. We normalize each surface exposure by the total surface area of the side-chain atoms making up the given residue (Creighton 1993). This yields a fractional exposure for each residue in a structure. We compute an average surface-exposure pattern for a structure using its FSSP alignment:
$$\bar{a}_{i} = \frac{1}{L} \sum_{j=1}^{L} a_{i,j} \qquad (3)$$
where $L$ is the number of known structures that have a residue aligned with residue $i$ and $a_{i,j}$ denotes the surface-accessible area of residue $i$ in structure $j$ of the alignment. Performing this procedure for each residue $i$ of the representative structure leads to a sequence of surface accessibilities $\{\bar{a}_{1}, \ldots, \bar{a}_{N}\}$.
The correlation coefficient $c^{s}$ between the hydrophobicity sequence and the accessible surface-area sequence of a structure is given by:
![]() | (4) |
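The averaging and correlation steps described in this section can be sketched as below. This is an illustration under stated assumptions rather than the analysis code itself: the correlation is assumed to take the standard Pearson form, the toy scale values are invented and do not come from any published scale, and all function names are hypothetical. The sign of the result depends on how a scale is oriented (whether larger values mean more hydrophobic or more polar), which this sketch does not fix.

```python
from math import sqrt


def column_means(rows):
    """Average each aligned position over the rows that have a value there.

    None marks a gap, so the number of terms (M or L in the text) can
    differ from position to position.
    """
    means = []
    for column in zip(*rows):
        values = [v for v in column if v is not None]
        means.append(sum(values) / len(values) if values else None)
    return means


def average_hydrophobicity(aligned_sequences, scale):
    """Map aligned sequences to hydrophobicities and average per position."""
    rows = [[scale.get(res) for res in seq] for seq in aligned_sequences]
    return column_means(rows)


def correlation(h, a):
    """Pearson correlation over positions where both profiles are defined.

    Assumes neither profile is constant over those positions.
    """
    pairs = [(x, y) for x, y in zip(h, a) if x is not None and y is not None]
    n = len(pairs)
    mean_h = sum(x for x, _ in pairs) / n
    mean_a = sum(y for _, y in pairs) / n
    cov = sum((x - mean_h) * (y - mean_a) for x, y in pairs)
    var_h = sum((x - mean_h) ** 2 for x, _ in pairs)
    var_a = sum((y - mean_a) ** 2 for _, y in pairs)
    return cov / sqrt(var_h * var_a)


# Toy inputs (hypothetical values, larger meaning more hydrophobic).
toy_scale = {"L": 1.0, "A": 0.5, "K": -1.0}
alignment = ["LK-", "LKA", "KKA"]  # '-' is an alignment gap
exposures = [[0.1, 0.9, 0.4], [0.3, 0.7, None]]

h_bar = average_hydrophobicity(alignment, toy_scale)
a_bar = column_means(exposures)
c = correlation(h_bar, a_bar)
```

Because the toy scale assigns larger values to more hydrophobic residues, the buried positions carry high hydrophobicity and the resulting coefficient is negative; a scale with the opposite orientation would give a positive value of the same magnitude.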
Hydrophobic–polar model
In hydrophobic–polar (HP) models, hydrophobicity is the sole force driving the folding process (Dill 1985; Lau and Dill 1989). For an amino acid sequence that corresponds to a sequence of hydrophobicities $\{h_i\}$, the solvation energy of the sequence on a given structure is
![]() | (5) |
where $a_i$ is the surface exposure of residue $i$ in that structure. The native fold of a sequence is the one that minimizes this energy.
We use a representative set of structures to act as the space of potential folds. For a given amino acid sequence, we then use the above energy equation to determine the structure that has the lowest energy within the set of competing structures. We deem this to be the native fold of the sequence. Studies have shown that folding numerous random amino acid sequences in this way results in a nonuniform mapping of sequences to structures: Some structures turn out to be native folds far more often than others, and have been designated "designable" structures (Li et al. 1996).
Here we consider a representative set of 203,282 four-helix bundles for the competing set of structures (Emberly et al. 2002). This set was shown to cover the space of all possible four-helix folds at the 95% confidence level, and hence represents a relatively complete set of compact folds on which an HP sequence can compete. Then $10^{6}$ random amino acid sequences (the hydrophobicity scale based on transfer free energy between water and ethanol was used; Nozaki and Tanford 1971) were folded by selecting the ground-state structure for each sequence. The top 250 designable structures (each with several hundred sequences that fold into it) and their corresponding hydrophobicity sequences formed the model database on which the correlation analysis was performed.
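The folding and designability procedure can be illustrated with a small sketch. Several things here are assumptions and not taken from the text: the solvation energy is taken to be the linear form E = Σ h_i a_i with larger h meaning more hydrophobic, random hydrophobicities are drawn uniformly from [0, 1] instead of from random amino acid sequences mapped through a scale, sequences with a degenerate ground state are discarded, and all names are hypothetical.

```python
import random
from collections import Counter


def solvation_energy(h, a):
    """Assumed linear solvation energy: sum over residues of h_i * a_i."""
    return sum(hi * ai for hi, ai in zip(h, a))


def native_fold(h, structures):
    """Index of the unique lowest-energy structure, or None on a tie."""
    energies = [solvation_energy(h, a) for a in structures]
    e_min = min(energies)
    winners = [k for k, e in enumerate(energies) if e == e_min]
    return winners[0] if len(winners) == 1 else None


def designability(structures, n_sequences, rng=None):
    """Count how many random sequences pick each structure as native fold."""
    rng = rng or random.Random(0)
    length = len(structures[0])
    counts = Counter()
    for _ in range(n_sequences):
        # Toy stand-in for a random amino acid sequence mapped to a scale.
        h = [rng.uniform(0.0, 1.0) for _ in range(length)]
        k = native_fold(h, structures)
        if k is not None:
            counts[k] += 1
    return counts


# Three toy exposure patterns for a four-residue chain.
toy_structures = [
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5, 0.5],
]
counts = designability(toy_structures, n_sequences=1000)
top = counts.most_common(2)  # most designable toy structures
```

The highly designable structures are those with the largest counts; in the procedure described above, the sequences assigned to each of them would then be averaged and correlated with that structure's exposure pattern.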
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
Branden, C. and Tooze, J. 1999. Introduction to protein structure, pp. 6–7. Garland Publishing, New York.
Chan, H.S. 2000. Modeling protein density of states: Additive hydrophobic effects are insufficient for calorimetric two-state cooperativity. Proteins 40: 543–571.
Chan, H.S. and Bornberg-Bauer, E. 2002. Perspectives on protein evolution from simple exact models. App. Bioinformatics 1: 121–144.
Chothia, C. 1974. Hydrophobic bonding and accessible surface area in proteins. Nature 248: 338–339.
Chothia, C. 1992. One thousand families for the molecular biologist. Nature 357: 543–544.
Creighton, T.E. 1993. Hydrophobicity scale (p. 154). Surface accessibilities of amino acids (p. 142). In Proteins: Structures and molecular principles, 2nd ed. W.H. Freeman, New York.
Cui, Y. and Wong, W.H. 2000. Multiple-sequence information provides protection against mis-specified potential energy functions in the lattice model of proteins. Phys. Rev. Lett. 85: 5242–5245.
DeGrado, W.F. 1997. Proteins from scratch. Science 278: 80–81.
DeVido, D.R., Dorsey, J.D., Chan, H.S., and Dill, K.A. 1998. Oil/water partitioning has a different thermodynamic signature when the oil solvent chains are aligned than when they are amorphous. J. Phys. Chem. 102: 7272–7279.
Dill, K.A. 1985. Theory for the folding and stability of globular proteins. Biochemistry 24: 1501–1509.
Emberly, E.G., Wingreen, N.S., and Tang, C. 2002. Designability of α-helical proteins. Proc. Natl. Acad. Sci. 99: 11163–11168.
Engelman, D.M., Steitz, T.A., and Goldman, A. 1986. Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu. Rev. Biophys. Biomol. Struct. 15: 321–353.
Finkelstein, A.V. 1998. 3D protein folds: Homologs against errors – A simple estimate based on the random energy model. Phys. Rev. Lett. 80: 4823–4825.
Godzik, A., Kolinski, A., and Skolnick, J. 1995. Are proteins ideal mixtures of amino acids? Analysis of energy parameter sets. Protein Sci. 4: 2107–2117.
Holm, L. and Sander, C. 1996. Mapping the protein universe. Science 273: 595–602.
Irbäck, A. and Sandelin, E. 2000. On hydrophobicity correlations in protein chains. Biophys. J. 79: 2252–2258.
Irbäck, A., Peterson, C., and Potthast, F. 1996. Evidence for nonrandom hydrophobicity structures in protein chains. Proc. Natl. Acad. Sci. 93: 9533–9538.
Kamtekar, S., Schiffer, J.M., Xiong, H., Babik, J.M., and Hecht, M.H. 1993. Protein design by binary patterning of polar and nonpolar amino acids. Science 262: 1680–1685.
Kauzmann, W. 1959. Some factors in the interpretation of protein denaturation. Adv. Protein Chem. 14: 1–63.
Kyte, J. and Doolittle, R.F. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105–132.
Lau, K.F. and Dill, K.A. 1989. A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules 22: 3986–3997.
Lee, B. 1993. Estimation of the maximum change in stability of globular proteins upon mutation of a hydrophobic residue to another of smaller size. Protein Sci. 2: 733–738.
Lesser, G.J. and Rose, G.D. 1990. Hydrophobicity of amino acid subgroups in proteins. Proteins 8: 6–13.
Levitt, M. and Chothia, C. 1976. Structural patterns in globular proteins. Nature 261: 552–558.
Li, H., Helling, R., Tang, C., and Wingreen, N. 1996. Emergence of preferred structures in a simple model of protein folding. Science 273: 666–669.
Li, H., Tang, C., and Wingreen, N. 1998. Are protein folds atypical? Proc. Natl. Acad. Sci. 95: 4987–4990.
Lins, L., Thomas, A., and Brasseur, R. 2003. Analysis of accessible surface of residues in proteins. Protein Sci. 12: 1406–1417.
Miller, S., Janin, J., Lesk, A.M., and Chothia, C. 1987. Interior and surface of monomeric proteins. J. Mol. Biol. 196: 641–656.
Miller, J., Zeng, C., Wingreen, N., and Tang, C. 2002. Emergence of highly designable protein-backbone conformations in an off-lattice model. Proteins 47: 506–512.
Miyazawa, S. and Jernigan, R.L. 1996. Residue–residue potentials with a favorable contact pair term and an unfavorable high packing term, for simulation and threading. J. Mol. Biol. 256: 623–644.
Miyazawa, S. and Jernigan, R.L. 1999. Self-consistent estimation of inter-residue protein contact energies based on an equilibrium mixture approximation of residues. Proteins: Struct. Mol. Principles 34: 49–68.
Munoz, V. and Serrano, L. 1994. Intrinsic secondary structure propensities of the amino acids, using statistical φ–ψ matrices: Comparison with experimental scales. Proteins 20: 301–311.
Murzin, A.G., Brenner, S.E., Hubbard, T., and Chothia, C. 1995. SCOP: A structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247: 536–540.
Nauchitel, V.V. and Somorjai, R.L. 1994. Spatial and free energy distribution patterns of amino acid residues in water soluble proteins. Biophys. Chem. 51: 327–336.
Nozaki, Y. and Tanford, C. 1971. Solubility of amino acids and 2 glycine peptides in aqueous ethanol and dioxane solutions. Establishment of a hydrophobicity scale. J. Biol. Chem. 246: 2211.
Rose, G., Geselowitz, A., Lesser, G., Lee, R., and Zehfus, M. 1985. Hydrophobicity of amino acid residues in globular proteins. Science 289: 834–839.
Tanford, C. 1978. Hydrophobic effect and organization of living matter. Science 200: 1012–1018.
Taverna, D. and Goldstein, R.A. 2002a. Why are proteins marginally stable? Proteins 46: 105–109.
Taverna, D. and Goldstein, R.A. 2002b. Why are proteins so robust to site mutations? J. Mol. Biol. 315: 479–484.
White, S.H. and Jacobs, R.E. 1990. Statistical distribution of hydrophobic residues along the length of protein chains. Implications for protein folding and evolution. Biophys. J. 57: 911–921.