Centre de Biophysique Moléculaire Numérique (CBMN), Faculté des Sciences Agronomiques de Gembloux (FSAGx), 5030 Gembloux, Belgium
Reprint requests to: Robert Brasseur, CBMN, Passage des Déportés, 2, FSAGx, B-5030 Gembloux, Belgium; e-mail: brasseur.r{at}fsagx.ac.be; fax: 32-8162-2522.
(RECEIVED January 31, 2003; FINAL REVISION April 26, 2003; ACCEPTED April 28, 2003)
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0304803.
| Abstract |
|---|
Keywords: Solvent accessibility; Pex files; hydrophobicity; secondary structure; amphipathy
| Introduction |
|---|
The folding process of soluble proteins decreases the surface in contact with the solvent. This is related to the secondary structures of proteins. Accurate knowledge of residue accessibility would thus aid the prediction of secondary structures. Different methods of prediction are based on the use of protein structure databases and on multiple sequence alignments. They have various efficiencies, notably depending on the number of relative accessibility states (i.e., exposed, buried, and in-between; Rost and Sander 1994; Rost 1996; Li and Pan 2001; Naderi-Manesh et al. 2001; Yuan et al. 2002).
Further, because active sites of proteins are often located at the surface of the protein, greater insight into residue accessibility would be important in understanding and predicting structure/function relationships.
In the present study, we analyzed 587 proteins from the Protein Data Bank (PDB) using the Pex files. We extracted the total, hydrophobic, and hydrophilic accessible surfaces of residues. The method used to calculate the accessible surface is that of Shrake and Rupley (1973). The 587-protein bank is a nonredundant bank of structures (Liu and Chou 1999).
| Results |
|---|
Residues were analyzed in five structural classes: α-helix (Ha), parallel (Bp) and antiparallel (Ba) β-strands, β-structures (B), and random coil/turn (C-T) conformations. These structural elements are defined in Materials and Methods. They correspond to 32,806 residues analyzed for Ha; 2851 residues for Bp; 27,603 residues for Ba; 36,230 residues for B; and 45,220 residues for C-T.
For each structural class, the total, hydrophobic, and hydrophilic median ASAs were calculated (Tables 3–7) and compared to the residue ASA in the extended conformation (ASA calculated with a window of 3). The most accessible residues belong to the random coil/turn (C-T) class, whereas the Ba and Bp structures result in the most solvent-inaccessible residues.
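The percentages quoted in this section express the ASA of a residue in the folded protein relative to its ASA in the extended conformation. A minimal sketch of that ratio follows; the function name and the numerical values are hypothetical and are not taken from the tables.

```python
def relative_accessibility(asa_folded: float, asa_extended: float) -> float:
    """Return the folded-state ASA as a percentage of the extended-state ASA."""
    if asa_extended <= 0:
        raise ValueError("extended-state ASA must be positive")
    return 100.0 * asa_folded / asa_extended


# Hypothetical values in square angstroms, for illustration only.
value = relative_accessibility(asa_folded=18.0, asa_extended=180.0)
print(f"{value:.1f}% accessible")
```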
It should be noted that the hydrophilic accessible surface of the most hydrophobic residues (i.e., Ile, Val, Leu, Met, Phe) should correspond to the accessibility of their backbone (Table 3).
β-Strands (B-structures)
The residues are almost as accessible in B-structures as in random coil/turn structures, and the segregation between the hydrophobic and the hydrophilic residues is also observed (Table 4). A noticeable difference lies in how the hydrophobic and hydrophilic accessible surfaces of residues decrease (cf. columns 9, 10 of Tables 3 and 4). Although the hydrophobic and hydrophilic ASAs of hydrophilic residues are similarly reduced, hydrophobic residues show a twofold more pronounced decrease in the hydrophobic surface compared to the hydrophilic surface. This suggests that B-structure is more prone to shield the hydrophobic moiety of hydrophobic amino acids compared to the random coil/turn structure.
For Proline, which is 30% accessible, the opposite is observed; that is, the hydrophilic surface is less accessible than the hydrophobic one (as in the random coil/turn structure).
Further, as for random coil/turn, the hydrophilic accessible surface of the most hydrophobic residues in B-structures should correspond to the accessibility of the backbone.
Parallel and antiparallel sheets
These folds correspond to the least accessible residues: On average, only 10% of the residue surface is accessible (Tables 5, 6). This is particularly true for the hydrophobic residues such as Leu, Ile, Val, Met, Cys, and Phe (1%–5% accessibility). In these folds, again, the most accessible residues are Lys and Glu (23%–33% accessibility) and hydrophobic and hydrophilic residues are segregated, Ser and Thr having intermediate values. For almost all residues, the Ba/Bp structures appear to shield their hydrophobic domains from the solvent better than any other structure, as reported by Chothia (1976).
This suggests that a sequence will have a smaller hydrophobic accessible surface as a β-sheet than in any other conformation. This could be related to the formation of fibrils observed with highly hydrophobic peptides. Indeed, fibrils are made of antiparallel β-strands, as reported regarding the amyloid aggregates of Alzheimer's disease (Li et al. 1999; Schladitz et al. 1999).
It is interesting to note that the backbone of hydrophobic residues (corresponding to the hydrophilic, or phi, ASA of those residues) is no longer accessible in Ba/Bp structures, in contrast to what happens for random coil/turn and B-conformations.
α-Helices
Residues of helical folds have an intermediate solvent accessibility of about 20% (Table 7). The difference between the accessibility of hydrophobic and hydrophilic residues is highly marked. For α-helices, K, E, R, N, D, Q, S, and T are 45% (Lys) to 19% (Ser) accessible with an average of 35% accessibility, whereas hydrophobic residues are only 1%–5% accessible (except for Trp and Tyr, which are 10% accessible). This is linked to the observation that most α-helices of protein 3D structures are amphipathic (Chou et al. 1997). Amphipathic helices have most of the hydrophobic residues oriented toward the protein core, whereas the hydrophilic residues are water-accessible.
It should be noted that the helical structure similarly reduces the hydrophobic and the hydrophilic surfaces of all residues, in contrast to what happens in β-folds.
In the helical structure, as for β-sheets, the backbone is not accessible (hydrophilic surfaces of hydrophobic residues are almost null).
The same surface calculations were made by selecting the secondary structures attributed to the CO-side of the residue. The number of residues analyzed for each structural class remained similar, as did the surfaces (data not shown).
Analysis of ASAs in data sets containing only β- or α-proteins
In light of our observation that the β-fold better shields hydrophobic parts of amino acids from water whereas helical structure better segregates hydrophobic and hydrophilic residues (the latter remaining accessible to the solvent), we wondered whether this would also hold true for two data sets, one containing proteins with α-structures (no β-residues) and the other with β-proteins (no helical folds). These sets were extracted from the 587-protein bank by selecting 26 β-proteins and 55 α-proteins, as described in Materials and Methods.
Table 8 shows that the accessible surfaces of residues in helical conformation of α-proteins (corresponding to 6,530 residues) are the same as those determined for the 587 proteins, confirming the amphipathic character of the helical fold.
| Discussion |
|---|
Three groups of amino acids can be distinguished based on the relationships between their hydrophobic and hydrophilic accessible surfaces, either in the extended state or in the folded proteins. One group is made of the hydrophobic residues (I,L,V,F,M,A,G), another contains the hydrophilic residues (D,N,E,Q,R), and the third shows intermediate behavior (H,Y,W,S,T). Proline and Lysine stand apart. Among the hydrophilic residues, Lysine is peculiar: It is often regarded as a hydrophilic residue, but in the unfolded state it has almost equal hydrophobic and hydrophilic accessible surfaces. Moreover, in the folded state, it has the highest hydrophobic ASA of all residues. Proline also has special features, as it has the highest hydrophobic accessible surface among the hydrophobic residues. This is related to its preferential location in accessible turns of proteins.
The classification of amino acids into three amino acid "families" following their hydrophobic/hydrophilic accessible surface should be important in terms of the prediction of conservative mutations.
The different types of secondary structure correspond to different accessible surfaces. Random coils, turns, and β-strands that are either not H-bound or are H-bound to a structure that is not a strand are the most accessible folds, with an average of 30% of residue accessibility. In these structures, the backbone of the most hydrophobic residues (I,V,L,M,F) is quite accessible, with 10%–15% accessibility.
The β-sheets (parallel and antiparallel strands) are the most solvent-inaccessible structures (with about 10% of residue accessibility), whereas the helical conformation has an intermediate value, with about 20% of the residue surface accessible.
Both helical and β-sheet conformations shield the backbone of the most hydrophobic residues from water, in contrast to what happens for the "unordered" structures.
In all folds, there is a noticeable difference between the hydrophobic and hydrophilic residues, the latter being always more solvent-accessible. The greatest difference is observed in α-helices, related to their amphipathic character. When the protein folds, the hydrophobic side of the helix is buried in the protein core, and the hydrophilic side remains solvent-accessible. The β-sheets are the most appropriate structures to shield the hydrophobicity of residues. This is likely important in the formation of fibrils in pathological and nonpathological phenomena.
Note that Lysine and Glutamic acid are the most accessible residues, whereas Leucine, Isoleucine, and Valine are the most inaccessible, irrespective of secondary structures.
As described earlier by Chothia (1976) regarding 12 proteins, there is a simple relationship between the total accessible surface of folded proteins and their size. We also show that there is a balance between the hydrophobic and the hydrophilic surfaces of the 3D protein surface. This balance is maintained irrespective of the protein size, resulting in a patchwork surface of hydrophobic and hydrophilic areas. Size and accessibility of the patches should be important for protein-protein interaction sites and/or for activity, as suggested by others (Eisenhaber and Argos 1996; Jones and Thornton 1997).
| Materials and methods |
|---|
For the β-only proteins data set, 26 files were extracted with the criterion that the proteins do not contain α-helical structure; for the α-only proteins, 54 files were extracted with the criterion that they contain neither Ba- nor Bp-structured residues.
The α-bank corresponds to the following files:
1aep,1arv,1ash,1axn,1bbh,1bcf,1ccr,1cem,1cns,1cnt,1col,1cpq,1eca,1etp,1hlb,1huw,1ign,1ilk,1jvr,1kxu,1lis,1lki,1lrv,1maz,1mey,1mls,1msf,1nfn,1oct,1pbw,1poc,1rci,1rfb,1rhg,1spg,1sra,1uby,1vin,1vls,1xsm,2abd,2abk,2ccy,2cyp,2end,2fal,2hbg,2hmz,2lfb,2tct,2tmv,3sdh,6fab,9wga.
The β-bank corresponds to the following files:
1cea,1cur,1fbr,1fnf,1hce,1i1b,1knb,1lcl,1msa,1msp,1nfa,1npo,1pco,pdg,1ptx,1svp,1tgx,1tpg,1tul,1ulp,1vmo,1wba,1yha,2mpr,2pii,4fgf.
ASA-Pex files
Each Pex file originates from the PDB file of a protein. Each line of the Pex corresponds to a residue in the order of the sequence, and each column is a parameter calculated in the 3D structure as described by Thomas et al. (2001). In the ASA-Pex file, accessible surface areas (ASAs) of the whole residue (side chain and backbone) were calculated using the method of Shrake and Rupley (1973). In brief, the spherical surface of each atom is covered by a net of 642 points (the initial method used 92 points), and the points that lie within other expanded atoms are determined. We used the SERF program, which implements this method (Flower 1997).
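The point-counting idea can be sketched in a few lines of Python. This is a minimal illustration and not the SERF implementation: the golden-spiral layout of the 642 points, the 1.4 Å probe radius, and the atom radii in the usage lines are assumptions made here for the example.

```python
import math


def sphere_points(n=642):
    """Roughly uniform points on a unit sphere (golden-spiral layout, an assumption)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return points


def shrake_rupley(atoms, probe=1.4, n_points=642):
    """Per-atom ASA in square angstroms; atoms is a list of (x, y, z, radius)."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the expanded atom
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                rj_exp = rj + probe
                # A test point is buried if it lies inside another expanded atom.
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < rj_exp ** 2:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r ** 2 * exposed / n_points)
    return areas


# Hypothetical two-atom system: overlapping atoms lose part of their surface.
print(shrake_rupley([(0.0, 0.0, 0.0, 1.8), (0.0, 0.0, 2.0, 1.8)]))
```

The loop over all atom pairs is quadratic and is kept here only for readability; a production code would restrict the inner loop to neighboring atoms.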
Determination of the hydrophobic and hydrophilic accessible surfaces of residues
The method of Shrake and Rupley (1973) was also used to calculate the hydrophobic and hydrophilic ASAs. With this method, the hydrophobic and hydrophilic ASAs correspond to the sum of the surface of hydrophobic or hydrophilic atoms of the residue, respectively. Atoms are considered hydrophobic or hydrophilic depending on their transfer energy, as described (Brasseur 1991).
Median values of total, hydrophobic, and hydrophilic accessible surfaces were calculated using Excel software (Microsoft).
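The two steps described above, summing atomic surfaces by class and taking medians per residue type and structural class, could be sketched as follows. The study used Excel for the medians; Python's `statistics.median` is used here as a stand-in. The function names, the boolean atom flags, and the sample records are hypothetical, and the real atom classification relies on transfer energies (Brasseur 1991) that are not reproduced here.

```python
from collections import defaultdict
from statistics import median


def split_asa(atom_asas, atom_is_hydrophobic):
    """Sum per-atom ASAs of one residue into (hydrophobic, hydrophilic) totals."""
    pho = sum(a for a, h in zip(atom_asas, atom_is_hydrophobic) if h)
    phi = sum(a for a, h in zip(atom_asas, atom_is_hydrophobic) if not h)
    return pho, phi


def median_by_group(records):
    """records: iterable of (residue_name, structure_class, asa_value)."""
    groups = defaultdict(list)
    for residue, structure, value in records:
        groups[(residue, structure)].append(value)
    return {key: median(values) for key, values in groups.items()}


# Hypothetical records, for illustration only.
records = [("LEU", "Ha", 1.0), ("LEU", "Ha", 3.0), ("LEU", "Ha", 10.0), ("LYS", "Ha", 40.0)]
print(split_asa([10.0, 5.0, 2.0], [True, False, True]))
print(median_by_group(records))
```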
Determination of the secondary structure of the residues
In the PDB files, no protein has a complete description of secondary structures. We therefore established our definition of secondary structures based on the (phi/psi) values and on the occurrence of main chain H-bonds (O..H distance less than 3.5 Å), as described in Thomas et al. (2001). Two different structures can be attributed to the same residue because a residue can be involved in two H-bonds, one on its NH side, the other on its CO side. Both secondary structures are listed in the Pex. In the present study, the NH secondary structure was considered.
Definition of the different secondary structures
Helices
Helical residues have a main chain H-bond and φ/ψ within a circle of 45° around the couple φ = -57° and ψ = -47°. The main chain H-bond has an O . . . H distance less than 3 Å. The helix is α (Ha) when the n and the n ± 4 residues are H-bound.
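The helix criterion can be expressed as a small predicate. This is a sketch under stated assumptions: the "circle of 45°" is read as a radius of 45° in the φ/ψ plane, angular differences are wrapped to account for periodicity, and the function and parameter names are hypothetical.

```python
import math


def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def in_circle(phi, psi, phi0, psi0, radius):
    """True if (phi, psi) lies within `radius` degrees of (phi0, psi0)."""
    return math.hypot(angle_diff(phi, phi0), angle_diff(psi, psi0)) <= radius


def is_alpha_helical(phi, psi, oh_distance, partner_offset):
    """Apply the Ha rule: phi/psi near (-57, -47), O...H < 3 A, partner at n +/- 4.

    oh_distance is None when the residue has no main chain H-bond.
    """
    return (
        in_circle(phi, psi, -57.0, -47.0, 45.0)
        and oh_distance is not None
        and oh_distance < 3.0
        and abs(partner_offset) == 4
    )


print(is_alpha_helical(-60.0, -45.0, 2.0, 4))    # helical phi/psi with an n+4 H-bond
print(is_alpha_helical(-129.0, 123.0, 2.0, 4))   # beta-region phi/psi
```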
β-Structures
Residues are in β when the φ/ψ are within a circle of 90° around the φ = -129° and ψ = 123°. When two strands are H-bound, they are either antiparallel (Ba) or parallel (Bp) sheets. The sheets are parallel when the vectors between the Cα of the residues n and n + 1 of each strand draw an angle of -90° to +90°. They are antiparallel when the same vector angles are between 90° and 180° or -90° and -180° apart. B is for β-strands that are either not H-bound or are H-bound to a structure that is not a strand.
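The parallel/antiparallel rule compares the direction vectors joining consecutive Cα atoms on each strand. A minimal sketch follows, using the unsigned angle from the dot product, which gives the same classification as the signed ranges in the text. The function name and the coordinates are hypothetical, and assigning an angle of exactly 90° to the parallel class is an assumption made here.

```python
import math


def strand_orientation(ca_n_a, ca_n1_a, ca_n_b, ca_n1_b):
    """Return 'Bp' or 'Ba' from the Calpha(n) -> Calpha(n+1) vectors of two strands."""
    va = [q - p for p, q in zip(ca_n_a, ca_n1_a)]
    vb = [q - p for p, q in zip(ca_n_b, ca_n1_b)]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(x * x for x in vb))
    if norm == 0.0:
        raise ValueError("consecutive Calpha atoms must not coincide")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    angle = math.degrees(math.acos(cos_angle))
    return "Bp" if angle <= 90.0 else "Ba"


# Hypothetical coordinates in angstroms: two strands running the same way, then opposite ways.
print(strand_orientation((0, 0, 0), (3.8, 0, 0), (0, 4.8, 0), (3.8, 4.8, 0)))
print(strand_orientation((0, 0, 0), (3.8, 0, 0), (3.8, 4.8, 0), (0, 4.8, 0)))
```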
Random coils/turns
The φ/ψ values of the different turns are from Srinivasan and Rose (1995). The presence of an H-bond is not mandatory.
The φ/ψ of random coils span a large range of values, including the left helices that were not individualized in this study. Random coils also account for right helices and β-residues when the H-acceptor (the O=C residue, n + i) and the NH-donor (n) are too far away in the sequence for a helix (i > 6) or too close for a β-sheet (i < 3).
Molecular hydrophobicity potential (MHP) calculations
MHP is a three-dimensional plot of the hydrophobicity potential of a molecule created to visualize its amphipathy. The hydrophobicity of a molecule is calculated using its partition coefficient between water and octanol.
We postulate that the hydrophobicity induced by an atom i and measured at a point M of space decreases exponentially with the distance between this point M and the surface of an atom i according to the equation (Brasseur 1991):
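$$\mathrm{MHP}(M) = \sum_{i=1}^{N} E_{\mathrm{tr},i}\,\exp(r_i - d_i)$$

where $N$ is the number of atoms, $E_{\mathrm{tr},i}$ is the transfer energy of atom $i$, $r_i$ its radius, and $d_i$ the distance between atom $i$ and the point $M$.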
The hydrophobic and hydrophilic isopotential surfaces were calculated by a cross-sectional computational method. A 1-Å mesh-grid plane was set to sweep across the molecule by steps of 1 Å. At each step, the sum of the hydrophobicity and hydrophilicity values at all grid nodes was calculated. The hydrophobic and hydrophilic MHP surfaces were then drawn by joining the isopotential values.
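The cross-sectional sweep can be illustrated with a short sketch. It assumes that the exponential decay from the atom surface takes the form E_tr,i · exp(r_i − d_i), with d_i the distance from the atom center to the point M. The function names, the atom tuple layout, and the sample transfer energy are hypothetical, and this is not the Z-TAMMO implementation.

```python
import math


def mhp_at(point, atoms):
    """Sum atomic contributions at one point; atoms is a list of (x, y, z, radius, e_tr)."""
    total = 0.0
    for x, y, z, radius, e_tr in atoms:
        d = math.sqrt((point[0] - x) ** 2 + (point[1] - y) ** 2 + (point[2] - z) ** 2)
        total += e_tr * math.exp(radius - d)
    return total


def mhp_plane(atoms, z, x_min, x_max, y_min, y_max, step=1.0):
    """Evaluate the potential on one grid plane at height z (rows follow y, columns x)."""
    nx = int(round((x_max - x_min) / step)) + 1
    ny = int(round((y_max - y_min) / step)) + 1
    return [
        [mhp_at((x_min + i * step, y_min + j * step, z), atoms) for i in range(nx)]
        for j in range(ny)
    ]


# Hypothetical single atom; the plane is swept in 1-angstrom steps along z.
atoms = [(0.0, 0.0, 0.0, 1.5, 2.0)]
planes = [mhp_plane(atoms, z, -3.0, 3.0, -3.0, 3.0) for z in (-1.0, 0.0, 1.0)]
print(len(planes), len(planes[0]), len(planes[0][0]))
```

Isopotential contours would then be drawn by joining grid nodes of equal value on each plane, a step omitted here.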
All calculations were performed on Pentium III processors, using Z-TAMMO and Z-PEX software. Molecular graphs were drawn using WinMGM (Ab Initio).
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
Chothia, C. 1976. The nature of the accessible and buried surfaces in proteins. J. Mol. Biol. 105: 1–12.
Chou, K.C., Zhang, C.T., and Maggiora, G.M. 1997. Disposition of amphiphilic helices in heteropolar environments. Proteins 28: 99–108.
Creighton, T. 1993. Chemical properties of polypeptides. In Chemical properties of polypeptides (ed. T. Creighton), pp. 1–46. W.H. Freeman and Company, New York.
Eisenhaber, F. and Argos, P. 1996. Hydrophobic regions on protein surfaces: Definition based on hydration shell structure and a quick method for their computation. Protein Eng. 9: 1121–1133.
Flower, D.R. 1997. SERF: A program for accessible surface area calculations. J. Mol. Graph. Model. 15: 238–244.
Jones, S. and Thornton, J.M. 1997. Analysis of protein–protein interaction sites using surface patches. J. Mol. Biol. 272: 121–132.
Li, L., Darden, T.A., Bartolotti, L., Kominos, D., and Pedersen, L.G. 1999. An atomic model for the pleated β-sheet structure of Aβ amyloid protofilaments. Biophys. J. 76: 2871–2878.
Li, X. and Pan, X.M. 2001. New method for accurate prediction of solvent accessibility from protein sequence. Proteins 42: 1–5.
Liu, W. and Chou, K.C. 1999. Prediction of protein secondary structure content. Protein Eng. 12: 1041–1050.
Naderi-Manesh, H., Sadeghi, M., Arab, S., and Moosavi Movahedi, A.A. 2001. Prediction of protein surface accessibility with information theory. Proteins 42: 452–459.
Rost, B. 1996. PHD: Predicting one-dimensional protein structure by profile-based neural networks. Methods Enzymol. 266: 525–539.
Rost, B. and Sander, C. 1994. Conservation and prediction of solvent accessibility in protein families. Proteins 20: 216–226.
Samanta, U., Bahadur, R.P., and Chakrabarti, P. 2002. Quantifying the accessible surface area of protein residues in their local environment. Protein Eng. 15: 659–667.
Schladitz, C., Vieira, E.P., Hermel, H., and Mohwald, H. 1999. Amyloid-β-sheet formation at the air–water interface. Biophys. J. 77: 3305–3310.
Shrake, A. and Rupley, J.A. 1973. Environment and exposure to solvent of protein atoms. Lysozyme and insulin. J. Mol. Biol. 79: 351–371.
Srinivasan, R. and Rose, G.D. 1995. LINUS: A hierarchic procedure to predict the fold of a protein. Proteins 22: 81–99.
Tanford, C. 1973. In The hydrophobic effect: Formation of micelles and biological membranes (ed. C. Tanford), pp. 5–20. Wiley, New York.
Thomas, A., Bouffioux, O., Geeurickx, D., and Brasseur, R. 2001. Pex, analytical tools for PDB files. I. GF-Pex: Basic file to describe a protein. Proteins 43: 28–36.
Thomas, A., Meurisse, R., Charloteaux, B., and Brasseur, R. 2002a. Aromatic side-chain interactions in proteins. I. Main structural features. Proteins 48: 628–634.
Thomas, A., Meurisse, R., and Brasseur, R. 2002b. Aromatic side-chain interactions in proteins. II. Near- and far-sequence Phe-X pairs. Proteins 48: 635–644.
Yuan, Z., Burrage, K., and Mattick, J.S. 2002. Prediction of protein solvent accessibility using support vector machines. Proteins 48: 566–570.