Protein Science Sheba protein
Protein Science (2002), 11:2766-2773.
Copyright © 2002 The Protein Society

Long membrane helices and short loops predicted less accurately

Chien Peter Chen1 and Burkhard Rost1,2,3

1 CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
2 Columbia University Center for Computational Biology and Bioinformatics (C2B2), Russ Berrie Pavilion, New York, NY 10032, USA
3 North East Structural Genomics Consortium (NESG), Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA

Reprint requests to: Burkhard Rost, Columbia University, New York, NY 10032, USA; e-mail: rost{at}columbia.edu; fax: (212) 305-7932.

(RECEIVED May 5, 2002; FINAL REVISION September 16, 2002; ACCEPTED September 16, 2002)

Terminology: Advanced prediction methods: all methods that do not exclusively use a hydrophobicity scale; simple prediction methods: membrane prediction methods exclusively based on hydrophobicity scales; loop: referring to the region that connects two transmembrane helices in sequence; in particular, such loops could consist of entire structural domains.

Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0214602.


    Abstract
 
Low-resolution experiments suggest that most membrane helices span 17–25 residues and that most loops between two helices are longer than 15 residues. Both constraints have been used explicitly in the development of prediction methods. Here, we compared the largest possible sequence-unique data sets from high- and low-resolution experiments. For the high-resolution data, we found that only half of the helices fall into the expected length interval and that half of the loops were shorter than 10 residues. We compared the accuracy of detecting short loops and long helices for 28 advanced and simple prediction methods: All methods predicted short loops less accurately than longer ones. In particular, loops shorter than 7 residues appeared to be very difficult to detect by current methods. Similarly, all methods tended to be more accurate for longer than for shorter helices. However, helices with more than 32 residues were predicted less accurately than all other helices. Our findings may suggest particular strategies for improving predictions of membrane helices.

Keywords: Membrane proteins; protein structure prediction; predicting transmembrane helices; bioinformatics

Abbreviations: 3D, three-dimensional • DSSP, program assigning secondary structure (Kabsch and Sander 1983) • HMM, hidden Markov model • PDB, Protein Data Bank of experimentally determined 3D structures of proteins (Bernstein et al. 1977; Berman et al. 2000) • SWISS-PROT, database of protein sequences (Bairoch and Apweiler 2000) • TM, transmembrane • TMH, transmembrane helix


    Introduction
 
Predictions of membrane helices relatively successful. Despite the great biological and medical importance of helical membrane proteins, we still know few three-dimensional (3D) structures. Fortunately, bioinformatics can contribute substantially to bridging the gap between what we know and what we want to know by predicting membrane helices. In fact, predicting the locations of transmembrane helices (TMH) appears to be a simpler problem than predicting globular helices (Rost 1996, 2001). Nevertheless, although some investigators estimated accuracy levels as high as 99% (Jayasinghe et al. 2001), recent re-evaluations of many prediction methods (Ikeda et al. 2001; Möller et al. 2001; Chen et al. 2002) somewhat dampened this optimism by concluding that only the very best advanced methods predict all membrane helices correctly for >70% of all proteins, and that simple hydrophobicity scale-based methods tend to be ~20 percentage points less accurate.

Distribution of membrane helix length crucial parameter for prediction. Prediction methods typically exploit the observation that TMH are predominantly apolar and are believed to be between 17 and 25 residues long (von Heijne 1996). The upper and lower bounds for the length of membrane helices are explicitly used by most prediction methods in two ways. (1) Some methods identify as membrane helices only those hydrophobic regions that fall into the typical length interval (von Heijne 1992; Casadio et al. 1996; Persson and Argos 1996; Hirokawa et al. 1998; Ikeda et al. 2001; Jayasinghe et al. 2001). (2) Other methods search for the best path through some predicted membrane helix propensity landscape that is compatible with such upper and lower bounds (Jones et al. 1994; Rost et al. 1996a, b; Krogh et al. 2001; Tusnady and Simon 2001). James Bowie found that the length distribution of three high-resolution structures was shifted toward longer helices (Bowie 1997).

Here, we re-evaluated the distribution of the length of TMH and that of the loops between helices based on a significantly larger data set than previously used (Bowie 1997). Then, we analyzed 28 prediction methods in terms of their performance on short loops and long membrane helices.


    Results and Discussion
 
Many long helices and short loops observed in high-resolution structures
Many membrane helices longer than 32 residues!
Half of all membrane helices annotated by low-resolution experiments were 20–24 residues long, whereas only about one-fourth of the high-resolution helices fell into this length interval (Fig. 1, inset). Helices of 17–27 residues accounted for less than half of the high-resolution and for 93% of the low-resolution data. The distribution of lengths was clearly shifted toward longer helices in the high-resolution data (Fig. 1). In particular, 12 high-resolution helices (9%) were longer than 34 residues, that is, fell outside the range of what the low-resolution experiments suggested as possible lengths for membrane helices. The following four proteins had the longest helices: (1) the cytochrome BC1 complex (1BGY:D, 34 residues; 1BGY:G, 43 residues; Iwata et al. 1998); (2) the calcium ATPase (1EUL:A, TMH 6, 39 residues; Toyoshima et al. 2000); (3) the cytochrome C oxidase (2OCC:I, TMH 1, 39 residues; Tsukihara et al. 1996); and (4) the fumarate reductase (1FUM:C, TMH 2, 38 residues; Iverson et al. 1999). Typically, the long helices were either slightly bent (1BGY:D, 1FUM:C) or extended into globular domains (1EUL:A, Fig. 2). Overall, the recent high-resolution data appeared to strongly challenge the assumption of many developers of prediction methods, namely that the vast majority of membrane helices are 17–25 residues long. This incorrect assumption has been implemented as a more or less rigid constraint into most existing prediction methods. In fact, implementing such a constraint is important because many regions in membrane proteins consist of >40–60 consecutive hydrophobic residues that usually form more than one membrane helix. These long hydrophobic regions have to be "dissected" by prediction methods, not least to accurately predict topology. Thus, the unexpected reality observed in high-resolution structures (Figs. 1, 2) complicates the prediction task.
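The interval statistics quoted above can be reproduced from any list of helix lengths. The following is a minimal sketch, not the authors' code: the function names and the toy lengths are assumptions made for illustration, and the toy numbers are not the data behind Fig. 1.

```python
from typing import Sequence


def percent_in_interval(lengths: Sequence[int], low: int, high: int) -> float:
    """Percentage of helices with low <= length <= high (inclusive bounds)."""
    if not lengths:
        raise ValueError("need at least one helix length")
    hits = sum(1 for n in lengths if low <= n <= high)
    return 100.0 * hits / len(lengths)


def percent_longer_than(lengths: Sequence[int], threshold: int) -> float:
    """Percentage of helices strictly longer than `threshold` residues."""
    if not lengths:
        raise ValueError("need at least one helix length")
    hits = sum(1 for n in lengths if n > threshold)
    return 100.0 * hits / len(lengths)


# Made-up helix lengths for illustration only; NOT the data set of this study.
toy_lengths = [15, 19, 21, 22, 24, 26, 29, 31, 35, 39]

narrow = percent_in_interval(toy_lengths, 20, 24)   # "typical" 20-24 interval
wide = percent_in_interval(toy_lengths, 17, 27)     # wider 17-27 interval
very_long = percent_longer_than(toy_lengths, 32)    # helices beyond 32 residues
print(narrow, wide, very_long)
```

Applied to DSSP-derived helix lengths and to SWISS-PROT annotations separately, the same two functions would yield the kind of comparison shown in the inset of Fig. 1.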



 
Fig. 1. Length distributions for membrane helices. The lengths of the membrane helices were assigned using DSSP for the high-resolution data (36 unique chains; 131 helices), and using the annotation in SWISS-PROT for the low-resolution data (165 unique proteins; 339 helices). The inset gives the cumulative percentages of helices. About half the high-resolution helices (47%) are 17–27 residues long, whereas 93% of the low-resolution helices fall into this interval (gray line). Half of the low-resolution helices (50%) are 20–24 residues long, whereas 25% of the high-resolution helices fall into this interval (dashed line).

 


 
Fig. 2. Long membrane helices in high-resolution structures. The plots were generated using the RASMOL program (Sayle and Milner-White 1995). All transmembrane helices shown in black extend over >38 residues.

 
High-resolution structures revealed considerable proportion of short loops.
Monne, von Heijne, and coworkers experimentally established the propensities of amino acids to form tight turns (loops) between membrane helices (Monne et al. 1999a,b; Monne and von Heijne 2001). They found that the charged and polar amino acids DEQNRK as well as the flexible P and G have the highest preferences to form tight turns. However, in their data set these investigators found very few proteins with loops shorter than 7 residues. Plotting the length distribution of loops, we noticed two important results (Fig. 3): (1) Low-resolution experiments tended to suggest significantly longer loops than high-resolution structures, and (2) about half of all loops in high-resolution structures were 10 residues or shorter and >20% of the high-resolution loops were ≤5 residues long. Obviously, we cannot expect that the 36 sequence-unique high-resolution chains used in our study (see Materials and Methods) are fully representative of all helical membrane proteins. Given that we predict about 20,000 helical membrane proteins in the five entirely sequenced eukaryotes alone (Liu and Rost 2001, 2002), we also doubt that the 165 sequence-unique low-resolution proteins (see Materials and Methods) are more representative. Clearly, the high-resolution data are more accurate than the low-resolution data. Thus, our data suggested that a considerable percentage of all loops between membrane helices are very short.



 
Fig. 3. Length distributions for loops between two membrane helices. The lower graph gives the percentage of all loops between two membrane helices that have N (shown 0–25) residues; the upper graph shows the cumulative data. For example, 65% of all loops in high-resolution structures (black lines with solid triangles) are ≤15 residues long, whereas 65% of the loops in low-resolution experiments are ≤25 residues. Significantly more short loops are observed in the high- than in the low-resolution data. Although ~40% of the high-resolution loops were shorter than 9 residues, only half as many loops in the low-resolution set were that short.

 
Long helices and short loops challenge prediction methods
Short loops predicted at lower accuracy.
As discussed previously, about half of all loops connecting two membrane helices are shorter than 10 residues. Most prediction methods compile averages using windows of 13–25 consecutive residues. Thus, the signal from the flanking helices may override that for a short loop. If so, we expect short loops to be predicted less accurately. Our data clearly confirmed this suspicion: Shorter loops are predicted by all methods less accurately than long loops (Fig. 4 and Table 1). The low-resolution data suggested that prediction accuracy decreased significantly for loops shorter than 10 residues, whereas the high-resolution data suggested the significant decrease to occur for loops shorter than 7 residues (Fig. 4). For example, although ~90% of the loops longer than 15 residues were correctly detected by the advanced prediction methods, <60% of the loops ≤5 residues were identified (Fig. 4, left graph). These data suggested that methods that predict membrane helices should explicitly embed loop preferences, such as the ones derived by the von Heijne group (Monne et al. 1999a, 1999b; Monne and von Heijne 2001).
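The window argument can be made concrete with a toy calculation. The sketch below is an illustration under stated assumptions, not any of the evaluated methods: it averages the Kyte-Doolittle hydropathy index over a centered window, uses an invented sequence (two poly-Leu stretches joined by a 5-residue polar loop), and takes 1.6 as an assumed decision cutoff. The function name `window_average` is hypothetical.

```python
from typing import List

# Kyte-Doolittle hydropathy index (Kyte and Doolittle 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(sequence: str, window: int) -> List[float]:
    """Centered moving average of hydropathy; windows are truncated at the ends."""
    half = window // 2
    values = [KD[aa] for aa in sequence]
    profile = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


# Invented toy sequence: two 20-residue hydrophobic stretches and a short loop.
loop = "GNPDG"
sequence = "L" * 20 + loop + "L" * 20
loop_positions = range(20, 20 + len(loop))

THRESHOLD = 1.6  # assumed cutoff: averages below it are called "loop"

wide = window_average(sequence, 19)   # window size in the 13-25 range
narrow = window_average(sequence, 5)  # window no longer than the loop

loop_seen_wide = any(wide[i] < THRESHOLD for i in loop_positions)
loop_seen_narrow = any(narrow[i] < THRESHOLD for i in loop_positions)
print(loop_seen_wide, loop_seen_narrow)
```

With the 19-residue window, every window centered on the loop also contains 14 flanking leucines, so the average never drops below the cutoff and the two helices merge into one. With a window no longer than the loop, the dip becomes visible.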



 
Fig. 4. Accuracy in predicting short loops. (Left) Percentage of loops with N (0–30) residues that were correctly predicted by all advanced prediction methods; the bars indicate the error estimates for these values. Note that the high-resolution data set was too small to display noncumulative distributions. (Right) Difference in prediction accuracy between loops shorter and longer than the respective loop length (Eq. 1). All values were negative, implying that longer loops were always predicted at higher accuracy than shorter ones.

 

 
Table 1. Performance for short loops
 
Prediction accuracy depended on helix length.
When we correlated prediction accuracy to the length of the observed TMH, we observed three overall trends (Fig. 5): (1) For any chosen threshold in the number of residues N with N ≤ 32 residues used to group membrane helices into short and long, helices shorter than N were predicted less accurately than were helices longer than N; (2) the trend was inverted for helices longer than 32 residues (only available for high-resolution data): these very long helices were predicted less accurately than all other helices; and (3) helices shorter than 17–20 residues posed an even stronger challenge to prediction methods than other helices (a significant drop of accuracy is shown in Fig. 5). At first sight, the decrease in prediction accuracy for helices longer than 32 residues may appear irrelevant in the context of predicting helical membrane proteins for entire proteomes (Goffeau et al. 1993; Rost et al. 1996b; Arkin et al. 1997; Frishman and Mewes 1997; Jones 1998; Wallin and von Heijne 1998; Gupta et al. 1999; Krogh et al. 2001; Liu and Rost 2001). However, if we can generalize from the currently known high-resolution structures, we expect that ~20% of all membrane helices are longer than 32 residues (Fig. 1). For the five entirely sequenced eukaryotic proteomes, this translates to ~5000 proteins with a helix longer than 32 residues (Liu and Rost 2001, 2002).



 
Fig. 5. Accuracy in predicting long membrane helices. The difference in prediction accuracy between membrane helices shorter and longer than N residues (Eq. 2) is shown. Negative values imply that short helices were predicted less accurately than longer ones.

 
Detailed analysis of the mistakes in predicting long helices
Advanced methods miss long helices; simple methods incorrectly split them.
To explore why membrane protein prediction methods have trouble with long helices, we visually classified the predictions for long helices (≥33 residues) as being (1) correct, (2) incorrectly cut into two membrane helices, or (3) not predicted at all (Fig. 6). Advanced methods predicted these long helices correctly with an accuracy of >90%. However, when these methods failed, they were about three times more likely to not predict a helix at all than to incorrectly predict the helix as two shorter helices. This may suggest that advanced methods mostly distinguish correctly between the membrane and the nonmembrane parts of long helices. Simple hydrophobicity-based methods identified only ~71% of the long helices correctly. In contrast to the advanced methods, simple methods were six times more likely to incorrectly split a long helix than to miss it. This may suggest that the difficulty of simple methods with long helices is primarily due to overpredicting helical regions. This is supported by the fact that simple methods have great sensitivity but poor specificity at detecting TMH (Chen et al. 2002). In contrast, advanced methods have better specificity at detecting a TMH, as is indicated by their high accuracy of predicting even long helices. However, the price for being highly specific is that they can miss some TMH.
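The three outcome classes were assigned by visual inspection. An automated stand-in could look like the sketch below, which is an assumption-laden illustration rather than the procedure used in this study: it reuses the 3-residue overlap rule from Materials and methods, the function name `classify_long_helix` is hypothetical, and all predicted segments are invented. Only the observed span 197–231 (1BGY:D) is taken from the text.

```python
from collections import Counter
from typing import List, Tuple

Segment = Tuple[int, int]  # (first residue, last residue), both inclusive


def classify_long_helix(observed: Segment, predicted: List[Segment],
                        min_overlap: int = 3) -> str:
    """Return 'correct', 'split', or 'missed' for one observed long helix."""
    obs_first, obs_last = observed
    hits = 0
    for first, last in predicted:
        # Number of residues shared by the predicted and the observed helix.
        overlap = min(last, obs_last) - max(first, obs_first) + 1
        if overlap >= min_overlap:
            hits += 1
    if hits == 0:
        return "missed"
    return "correct" if hits == 1 else "split"


observed = (197, 231)  # long helix of 1BGY:D as given in the text

# Invented predictions from four hypothetical methods.
toy_predictions = {
    "method_a": [(204, 222)],              # core only: still one helix
    "method_b": [(198, 212), (215, 230)],  # cut into two helices
    "method_c": [(100, 120)],              # nothing near the observed helix
    "method_d": [(190, 235)],              # one helix covering everything
}

tally = Counter(classify_long_helix(observed, segs)
                for segs in toy_predictions.values())
print(tally)
```

Dividing the "missed" count by the "split" count for a group of methods gives the kind of ratio discussed above (about three for advanced methods, about one-sixth for simple ones).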



 
Fig. 6. Visual inspection of errors for long membrane helices. Prediction methods incorrectly split long helices ~17% and miss them 5% of the time. When advanced methods predict long helices incorrectly, it is about three times more likely to not predict a helix at all than to split it. In contrast, simple hydrophobic methods are six times more likely to incorrectly predict a long helix as two shorter ones than to not predict the helix at all.

 
Visual inspections of a few cases suggest combining advanced and simple methods.
In the cytochrome BC1 complex, one of the helices (residues 197–231 of 1BGY:D) was not predicted by 50% of the advanced methods. One explanation for the difficulty in detecting this particular helix is the lack of a consecutive stretch of at least 17 hydrophobic residues. Even when the methods did predict its presence, they failed to predict residues at the amino and carboxyl termini of the helix (6 residues or more at either end). The residues not predicted as TM were often either polar or charged amino acids. For instance, residues 197–203 are EHDHRKR and residues 223–231 are KRHKWSVLK. This theme of predicting the core hydrophobic region of a TMH but not detecting the more polar amino- and carboxyl-ends for long helices was repeated for most of the predictions by the advanced methods. The simple hydrophobicity-based methods tended to identify these long helices. For instance, for 1BGY:D, many of the simple hydrophobicity-based methods missed the first 7 but correctly detected the last 7 residues. In fact, three simple methods captured the entire length of the observed helix. This example might suggest a potential strategy to deal with long membrane helices: (1) Identify consensus regions for long membrane helices through simple hydrophobicity scales; (2) determine the core of the membrane segment through advanced prediction methods; and (3) extend the predicted helix in the directions of both the amino and carboxyl termini by using a scoring matrix that is optimized for non-TM residues and by setting the boundaries of extension within the region defined by the simple methods.
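Step (3) of this proposed strategy can be sketched as a bounded extension of the core segment. The code below is only one possible reading of the idea: the function `extend_core`, the half-open index convention, and the `accept` callback (a stand-in for the scoring matrix, which the text does not specify) are all assumptions, and the sequence is invented.

```python
from typing import Callable, Tuple

Span = Tuple[int, int]  # half-open [start, end) indices into the sequence


def extend_core(sequence: str, core: Span, region: Span,
                accept: Callable[[str], bool]) -> Span:
    """Grow `core` residue by residue in both directions.

    Extension stops at the border of `region` (from the simple methods) or at
    the first residue rejected by `accept` (placeholder for a scoring matrix).
    """
    start, end = core
    region_start, region_end = region
    if not (0 <= region_start <= start < end <= region_end <= len(sequence)):
        raise ValueError("core must lie inside region, region inside sequence")
    while start > region_start and accept(sequence[start - 1]):
        start -= 1
    while end < region_end and accept(sequence[end]):
        end += 1
    return start, end


def no_helix_breaker(residue: str) -> bool:
    """Hypothetical acceptance rule: stop at the flexible residues P and G."""
    return residue not in "PG"


# Invented sequence: polar ends (KRE, RKW) flank a 17-residue apolar core.
toy_sequence = "PG" + "KRE" + "L" * 17 + "RKW" + "GP"
core = (5, 22)    # apolar core, as an advanced method might report it
region = (2, 25)  # consensus region, as simple methods might report it

print(extend_core(toy_sequence, core, region, no_helix_breaker))
```

The design keeps the two sources of information separate: the simple methods only cap how far the helix may grow, and the acceptance rule decides whether it grows that far.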


    Materials and methods
 
Data sets.
For the high-resolution data, we started with 105 chains from helical membrane proteins with high-resolution structures deposited in PDB (Berman et al. 2000). We then reduced the bias in this data set resulting from multiple copies of similar proteins. This left a set of 36 high-resolution proteins that were sequence-unique in the sense that no pair in that list had an HSSP distance above 0 (Rost 1999; for more details, see Chen et al. 2002). We identified membrane regions through DSSP (Kabsch and Sander 1983). For the low-resolution data, we used a sequence-unique subset of the expert-curated set of helical membrane proteins for which good low-resolution experimental evidence about localization was available (Möller et al. 2000). The final sequence-unique subset contained 165 proteins.

Advanced prediction methods.
We referred to prediction methods as advanced when they implemented more than simple hydrophobicity scales. We tested the following programs: DAS, HMMTOP (version 2), PHDhtm, PHDpsihtm, PRED-TMR, SOSUI, TMHMM (version 2), and TopPred2. TopPred2 averages the GES scale of hydrophobicity (Engelman et al. 1986) using a trapezoid window (von Heijne 1992; Sipos and von Heijne 1993). PHDhtm combines a neural network using evolutionary information with a dynamic programming optimization of the final prediction (Rost et al. 1995, 1996b). PHDpsihtm uses PSI-BLAST (Altschul et al. 1997) alignments as input (B. Rost, unpubl.). DAS optimizes the use of hydrophobicity plots (Cserzö et al. 1997). SOSUI (Hirokawa et al. 1998) uses a combination of hydrophobicity and amphiphilicity preferences to predict membrane helices. TMHMM is the most advanced, and seemingly the most accurate, current method to predict membrane helices (Sonnhammer et al. 1998). It embeds a number of statistical preferences and rules into a hidden Markov model to optimize the prediction of the localization of membrane helices and their orientation (note: similar concepts are used for HMMTOP; Tusnady and Simon 1998). PRED-TMR uses a standard hydrophobicity analysis with emphasis on detecting the ends and beginnings of membrane helices (Pasquier et al. 1999).

Simple methods exclusively based on hydrophobicity scales.
We also implemented our in-house prediction methods that simply used various hydrophobicity scales for prediction. In particular, we tested the following scales: A-Cid, normalized hydrophobicity scale for α proteins (Cid et al. 1992); Av-Cid, normalized average hydrophobicity scale (Cid et al. 1992); Ben-Tal, hydrophobicity scale representing free energy of transfer of an amino acid from water into the center of the hydrocarbon region of a model lipid bilayer (Kessel and Ben-Tal 2002); Bull-Breese, Bull-Breese hydrophobicity scale (Bull and Breese 1974); Eisenberg, normalized consensus hydrophobicity scale (Eisenberg et al. 1984); EM, solvation-free energy (Eisenberg and McLachlan 1986); Fauchere, hydrophobic parameter π from the partitioning of N-acetyl-amino acid amides (Fauchere and Pliska 1983); GES, hydrophobicity property (Engelman et al. 1986); Heijne, transfer-free energy to lipophilic phase (von Heijne and Blomberg 1979); Hopp-Woods, Hopp-Woods hydrophilicity value (Hopp and Woods 1981); KD, Kyte-Doolittle hydropathy index (Kyte and Doolittle 1982); Lawson, transfer-free energy (Lawson et al. 1984); Levitt, hydrophobic parameter (Levitt 1976); Nakashima, normalized composition of membrane proteins (Nakashima et al. 1990); Radzicka, transfer-free energy from 1-octanol to water (Radzicka and Wolfenden 1988); Roseman, solvation-corrected side chain hydropathy (Roseman 1988); Sweet, optimal matching hydrophobicity (Sweet and Eisenberg 1983); Wolfenden, hydration potential (Wolfenden et al. 1981); and WW, Wimley-White scale (Jayasinghe et al. 2001). To evaluate the predictive performance of each index, we ran the WW algorithm with the WW scale replaced by that index.

Measuring accuracy.
To establish whether or not short loops and long membrane helices pose particular problems for prediction methods, we had to deviate from the scores used to evaluate the performance of membrane prediction methods (Chen et al. 2002). In particular, we introduced the following scores that describe the difference in performance between short and long loops ($\Delta Q_L(N)$, Eq. 1), and that between short and long TMH ($\Delta Q_T(N)$, Eq. 2).

(1) Short loops.
We evaluated the performance of predicting short loops, that is, regions connecting two membrane helices with <N residues, by compiling the difference between the accuracy in predicting short and long loops:

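$$\Delta Q_L(N) = 100 \times \left( \frac{N^{\mathrm{identified}}_{\mathrm{loop}<N}}{N^{\mathrm{observed}}_{\mathrm{loop}<N}} - \frac{N^{\mathrm{identified}}_{\mathrm{loop}\ge N}}{N^{\mathrm{observed}}_{\mathrm{loop}\ge N}} \right) \qquad \text{(Eq. 1)}$$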
where N is the number of residues; $N^{\mathrm{identified}}_{\mathrm{loop}<N}$ is the number of loops with <N residues that were correctly predicted, and $N^{\mathrm{observed}}_{\mathrm{loop}<N}$ is the number of loops observed to have <N residues (the corresponding counts for loops with ≥N residues are defined analogously). We considered a loop of N residues to be correctly predicted if at least 1 residue in that loop was predicted as loop, that is, if the presence of a break between two helices was correctly identified. $\Delta Q_L(N)$ could adopt values between -100 and 100; negative values indicate that longer loops are predicted more accurately than shorter ones.
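The loop score can be computed directly from per-loop outcomes. The following sketch follows the definitions given in the text (a loop counts as detected if at least one of its residues is predicted as non-membrane; the score is the accuracy for loops with <N residues minus that for loops with ≥N residues, times 100). The function names, the track encoding ("T" for membrane, "-" for non-membrane), and all data are assumptions made for illustration.

```python
from typing import List, Tuple


def loop_detected(predicted_track: str, loop_start: int, loop_end: int) -> bool:
    """True if any residue in the half-open range [loop_start, loop_end)
    is predicted as non-membrane (any symbol other than 'T')."""
    return any(state != "T" for state in predicted_track[loop_start:loop_end])


def delta_q_loops(results: List[Tuple[int, bool]], n: int) -> float:
    """Accuracy for loops with < n residues minus accuracy for loops with
    >= n residues, in percentage points. `results` holds (length, detected)."""
    short = [ok for length, ok in results if length < n]
    long_ = [ok for length, ok in results if length >= n]
    if not short or not long_:
        raise ValueError("both length classes must be non-empty")
    return 100.0 * (sum(short) / len(short) - sum(long_) / len(long_))


# Invented prediction: a 4-residue loop at 20-23 is swallowed by one long
# predicted helix, whereas a 12-residue loop at 45-56 is predicted as a break.
predicted = "T" * 45 + "-" * 12 + "T" * 20
short_loop_found = loop_detected(predicted, 20, 24)
long_loop_found = loop_detected(predicted, 45, 57)

# Invented (loop length, detected) outcomes for six loops.
toy_results = [(3, False), (5, True), (6, False),
               (12, True), (18, True), (25, True)]
score = delta_q_loops(toy_results, 7)
print(short_loop_found, long_loop_found, round(score, 1))
```

A zero-length loop can never satisfy the detection rule as written here, so such cases would need separate handling.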

(2) Long helices.
In analogy to the score describing the performance for short loops, we evaluated the performance of predicting long TMH by compiling the difference between the accuracy in predicting short and long helices:

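$$\Delta Q_T(N) = 100 \times \left( \frac{N^{\mathrm{identified}}_{\mathrm{tm}\ge N}}{N^{\mathrm{observed}}_{\mathrm{tm}\ge N}} - \frac{N^{\mathrm{identified}}_{\mathrm{tm}<N}}{N^{\mathrm{observed}}_{\mathrm{tm}<N}} \right) \qquad \text{(Eq. 2)}$$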
where $N^{\mathrm{identified}}_{\mathrm{tm}\ge N}$ is the number of TMH with ≥N residues that were correctly predicted and $N^{\mathrm{observed}}_{\mathrm{tm}\ge N}$ is the number of TMH with ≥N residues observed. We considered a helix to be correctly predicted if it overlapped with the observed helix by at least 3 residues and if it was predicted as one continuous helix (over the region of the observed helix). This measure is illustrated in the following example for a prediction (T = TM; - = loop):

Observed: ---------TTTTTTTTTTTTTTTTTTTTT--------------

Predict 1: -------------------------TTTTTTTTTT--------

Predict 2: ---TTTTTTTTTTTTTT-TTTTTTTTTTTTTTTT---------

In this example, Predict 1 is right and Predict 2 is wrong because all we are trying to capture is whether or not methods tended to split long TMH. $\Delta Q_T(N)$ ranges from -100 to 100; it becomes negative if helices shorter than N residues are predicted more accurately than helices with ≥N residues.
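The helix criterion and the resulting score can be checked on tracks like those in the example. This is a minimal sketch under stated assumptions: the tracks are rebuilt with explicit run lengths that only approximate the example above, the function names are hypothetical, the per-helix outcomes are invented, and the sign convention follows the sentence above (negative when helices shorter than N are predicted more accurately).

```python
import re
from typing import List, Tuple


def tm_segments(track: str) -> List[Tuple[int, int]]:
    """Half-open (start, end) spans of consecutive 'T' symbols in a track."""
    return [m.span() for m in re.finditer(r"T+", track)]


def helix_correct(observed: Tuple[int, int], predicted_track: str,
                  min_overlap: int = 3) -> bool:
    """One continuous predicted helix must cover the observed helix region,
    sharing at least `min_overlap` residues with it."""
    obs_start, obs_end = observed
    overlaps = []
    for start, end in tm_segments(predicted_track):
        overlap = min(end, obs_end) - max(start, obs_start)
        if overlap > 0:
            overlaps.append(overlap)
    return len(overlaps) == 1 and overlaps[0] >= min_overlap


def delta_q_helices(results: List[Tuple[int, bool]], n: int) -> float:
    """Accuracy for helices with >= n residues minus accuracy for helices
    with < n residues, in percentage points."""
    long_ = [ok for length, ok in results if length >= n]
    short = [ok for length, ok in results if length < n]
    if not long_ or not short:
        raise ValueError("both length classes must be non-empty")
    return 100.0 * (sum(long_) / len(long_) - sum(short) / len(short))


# Tracks approximating the example (run lengths are assumptions).
observed_track = "-" * 9 + "T" * 21 + "-" * 14
predict_1 = "-" * 25 + "T" * 10 + "-" * 9
predict_2 = "-" * 3 + "T" * 14 + "-" + "T" * 16 + "-" * 10

observed = tm_segments(observed_track)[0]
print(helix_correct(observed, predict_1), helix_correct(observed, predict_2))

# Invented (helix length, correct) outcomes for five helices.
toy_results = [(20, True), (22, True), (25, False), (35, False), (38, True)]
print(round(delta_q_helices(toy_results, 33), 1))
```

Predict 1 passes because a single predicted helix shares 5 residues with the observed one. Predict 2 fails because two predicted helices fall inside the observed region.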


    Acknowledgments
 
Thanks to Jinfeng Liu (Columbia) for computer assistance and the collection of genome datasets. The work of B.R. was supported by grants 1-P50-GM62413-01 and RO1-GM63029-01 from the National Institutes of Health (NIH) and by grant DBI-0131168 from the National Science Foundation (NSF). Last, but not least, thanks to all those who deposit their experimental data in public databases and to those who maintain these databases.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


    References
 
Altschul, S., Madden, T., Schäffer, A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.

Arkin, I.T., Brünger, A.T., and Engelman, D.M. 1997. Are there dominant membrane protein families with a given number of helices? Proteins 28: 465–466.

Bairoch, A. and Apweiler, R. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28: 45–48.

Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucleic Acids Res. 28: 235–242.

Bernstein, F.C., Koetzle, T.F., Williams, G.J.B., Meyer, E.F., Brice, M.D., Rodgers, J.R., Kennard, O., Shimanouchi, T., and Tasumi, M. 1977. The Protein Data Bank: A computer based archival file for macromolecular structures. J. Mol. Biol. 112: 535–542.

Bowie, J.U. 1997. Helix packing in membrane proteins. J. Mol. Biol. 272: 780–799.

Bull, H.B. and Breese, K. 1974. Surface tension of amino acid solutions: A hydrophobicity scale of the amino acid residues. Arch. Biochem. Biophys. 161: 665–670.

Casadio, R., Fariselli, P., Taroni, C., and Compiani, M. 1996. A predictor of transmembrane α-helix domains of proteins based on neural networks. Eur. J. Biophys. 24: 165–178.

Chen, C.P., Kernytsky, A., and Rost, B. 2002. Transmembrane helix predictions revisited. Protein Sci. (this issue).

Cid, H., Bunster, M., Canales, M., and Gazitua, F. 1992. Hydrophobicity and structural classes in proteins. Prot. Engin. 5: 373–375.

Cserzö, M., Wallin, E., Simon, I., von Heijne, G., and Elofsson, A. 1997. Prediction of transmembrane α-helices in prokaryotic membrane proteins: The dense alignment surface method. Prot. Engin. 10: 673–676.

Eisenberg, D. and McLachlan, A.D. 1986. Solvation energy in protein folding and binding. Nature 319: 199–203.

Eisenberg, D., Weiss, R.M., and Terwilliger, T.C. 1984. The hydrophobic moment detects periodicity in protein hydrophobicity. Proc. Natl. Acad. Sci. 81: 140–144.

Engelman, D.M., Steitz, T.A., and Goldman, A. 1986. Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu. Rev. Biophys. Biophys. Chem. 15: 321–353.

Fauchere, J.L. and Pliska, V. 1983. Hydrophobic parameters pi of amino-acid side chains from the partitioning of N-acetyl-amino-acid amides. Eur. J. Med. Chem. 18: 369–375.

Frishman, D. and Mewes, H.W. 1997. Protein structural classes in five complete genomes. Nature Struct. Biol. 4: 626–628.

Goffeau, A., Nakai, K., Slonimski, P., and Risler, J.-L. 1993. The membrane proteins encoded by yeast chromosome III genes. FEBS Lett. 325: 112–117.

Gupta, R., Jung, E., Gooley, A.A., Williams, K.L., Brunak, S., and Hansen, J. 1999. Scanning the available Dictyostelium discoideum proteome for O-linked GlcNAc glycosylation sites using neural networks. Glycobiology 9: 1009–1022.

Hirokawa, T., Boon-Chieng, S., and Mitaku, S. 1998. SOSUI: Classification and secondary structure prediction system for membrane proteins. Bioinformatics 14: 378–379.

Hopp, T.P. and Woods, K.R. 1981. Prediction of protein antigenic determinants from amino acid sequences. Proc. Natl. Acad. Sci. 78: 3824–3828.

Ikeda, M., Arai, M., Lao, D.M., and Shimizu, T. 2001. Transmembrane topology prediction methods: A reassessment and improvement by a consensus method using a dataset of experimentally-characterized transmembrane topologies. In Silico Biol. 1: http://www.bioinfo.de/isb/2001/2002/0003/.

Iverson, T.M., Luna-Chavez, C., Cecchini, G., and Rees, D.C. 1999. Structure of the E. coli fumarate reductase respiratory complex. Science 284: 1961.

Iwata, S., Lee, J.W., Okada, K., Lee, J.K., Iwata, M., Rasmussen, B., Link, T.A., Ramaswamy, S., and Jap, B.K. 1998. Complete structure of the 11-subunit bovine mitochondrial cytochrome BC1 complex. Science 281: 64–71.

Jayasinghe, S., Hristova, K., and White, S.H. 2001. Energetics, stability, and prediction of transmembrane helices. J. Mol. Biol. 312: 927–934.

Jones, D.T. 1998. Do transmembrane protein superfolds exist? FEBS Lett. 423: 281–285.

Jones, D.T., Taylor, W.R., and Thornton, J.M. 1994. A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochem. 33: 3038–3049.

Kabsch, W. and Sander, C. 1983. Dictionary of protein secondary structure: Pattern recognition of hydrogen bonded and geometrical features. Biopolymers 22: 2577–2637.

Kessel, A. and Ben-Tal, N. 2002. Free energy determinants of peptide association with lipid bilayers. In Peptide–lipid interactions (eds. S. Simon and T. McIntosh). Academic Press, San Diego, CA (in press).

Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305: 567–580.

Kyte, J. and Doolittle, R.F. 1982. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157: 105–132.

Lawson, E.Q., Sadler, A.J., Harmatz, D., Brandau, D.T., Micanovic, R., MacElroy, R.D., and Middaugh, C.R. 1984. A simple experimental model for hydrophobic interactions in proteins. J. Biol. Chem. 259: 2910–2912.

Levitt, M. 1976. A simplified representation of protein conformations for rapid simulation of protein folding. J. Mol. Biol. 104: 59–107.

Liu, J. and Rost, B. 2001. Comparing function and structure between entire proteomes. Protein Sci. 10: 1970–1979.

———. 2002. Target space for structural genomics revisited. Bioinformatics 18: 922–933.

Möller, S., Kriventseva, E.V., and Apweiler, R. 2000. A collection of well characterised integral membrane proteins. Bioinformatics 16: 1159–1160.

Möller, S., Croning, D.R., and Apweiler, R. 2001. Evaluation of methods for the prediction of membrane spanning regions. Bioinformatics 17: 646–653.

Monne, M. and von Heijne, G. 2001. Effects of 'hydrophobic mismatch' on the location of transmembrane helices in the ER membrane. FEBS Lett. 496: 96–100.

Monne, M., Hermansson, M., and von Heijne, G. 1999a. A turn propensity scale for transmembrane helices. J. Mol. Biol. 288: 141–145.

Monne, M., Nilsson, I., Elofsson, A., and von Heijne, G. 1999b. Turns in transmembrane helices: Determination of the minimal length of a "helical hairpin" and derivation of a fine-grained turn propensity scale. J. Mol. Biol. 293: 807–814.

Nakashima, H., Nishikawa, K., and Ooi, T. 1990. Distinct character in hydrophobicity of amino acid composition of mitochondrial proteins. Proteins 8: 173–178.

Pasquier, C., Promponas, V.J., Palaios, G.A., Hamodrakas, J.S., and Hamodrakas, S.J. 1999. A novel method for predicting transmembrane segments in proteins based on a statistical analysis of the SwissProt database: the PRED-TMR algorithm. Protein Eng. 12: 381–385.

Persson, B. and Argos, P. 1996. Topology prediction of membrane proteins. Protein Sci. 5: 363–371.

Radzicka, A. and Wolfenden, R. 1988. Comparing the polarities of the amino acids: Side-chain distribution coefficients between the vapor phase, cyclohexane, 1-octanol, and neutral aqueous solution. Biochemistry 27: 1664–1670.

Roseman, M.A. 1988. Hydrophilicity of polar amino acid side-chains is markedly reduced by flanking peptide bonds. J. Mol. Biol. 200: 513–522.

Rost, B. 1996. PHD: Predicting one-dimensional protein structure by profile based neural networks. Methods Enzymol. 266: 525–539.

———. 1999. Twilight zone of protein sequence alignments. Protein Eng. 12: 85–94.

———. 2001. Protein secondary structure prediction continues to rise. J. Struct. Biol. 134: 204–218.

Rost, B., Casadio, R., Fariselli, P., and Sander, C. 1995. Prediction of helical transmembrane segments at 95% accuracy. Protein Sci. 4: 521–533.

Rost, B., Casadio, R., and Fariselli, P. 1996a. Refining neural network predictions for helical transmembrane proteins by dynamic programming. In Fourth International Conference on Intelligent Systems for Molecular Biology (eds. D. States), pp. 192–200. AAAI Press, St. Louis, MO, Menlo Park, CA.

———. 1996b. Topology prediction for helical transmembrane proteins at 86% accuracy. Protein Sci. 5: 1704–1718.

Sayle, R.A. and Milner-White, E.J. 1995. RASMOL: Biomolecular graphics for all. Trends Biochem. Sci. 20: 37.

Sipos, L. and von Heijne, G. 1993. Predicting the topology of eukaryotic membrane proteins. Eur. J. Biochem. 213: 1333–1340.

Sonnhammer, E.L.L., von Heijne, G., and Krogh, A. 1998. A hidden Markov model for predicting transmembrane helices in protein sequences. In Sixth International Conference on Intelligent Systems for Molecular Biology (ISMB98) (eds. J. Glasgow), pp. 175–182. AAAI Press, Montreal, Canada.

Sweet, R.M. and Eisenberg, D. 1983. Correlation of sequence hydrophobicities measures similarity in three-dimensional protein structure. J. Mol. Biol. 171: 479–488.

Toyoshima, C., Nakasako, M., Nomura, H., and Ogawa, H. 2000. Crystal structure of the calcium pump of sarcoplasmic reticulum at 2.6 Ångstrøm resolution. Nature 405: 647.

Tsukihara, T., Aoyama, H., Yamashita, E., Tomizaki, T., Yamaguchi, H., Shinzawa-Itoh, K., Nakashima, R., Yaono, R., and Yoshikawa, S. 1996. The whole structure of the 13-subunit oxidized cytochrome c oxidase at 2.8 Å. Science 272: 1136.

Tusnady, G.E. and Simon, I. 1998. Principles governing amino acid composition of integral membrane proteins: application to topology prediction. J. Mol. Biol. 283: 489–506.

———. 2001. Topology of membrane proteins. J. Chem. Inf. Comput. Sci. 41: 364–368.

von Heijne, G. 1992. Membrane protein structure prediction. J. Mol. Biol. 225: 487–494.

———. 1996. Prediction of transmembrane protein topology. In Protein structure prediction (ed. M.J.E. Sternberg), pp. 101–110. Oxford Univ. Press, Oxford, UK.

von Heijne, G. and Blomberg, C. 1979. Trans-membrane translocation of proteins: The direct transfer model. Eur. J. Biochem. 97: 175–181.

Wallin, E. and von Heijne, G. 1998. Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci. 7: 1029–1038.

Wolfenden, R., Andersson, L., Cullis, P.M., and Southgate, C.C.B. 1981. Affinities of amino acid side chains for solvent water. Biochemistry 20: 849–855.





