Department of Biological Sciences/Computer Science, Markey Center for Structural Biology, The Bindley Bioscience Center, Purdue University, West Lafayette, Indiana 47907, USA
Reprint requests to: Daisuke Kihara, Department of Biological Sciences/Computer Science, Markey Center for Structural Biology, The Bindley Bioscience Center, Purdue University, Lilly Hall, West Lafayette, IN 47907, USA; e-mail: dkihara{at}purdue.edu; fax: (765) 496-1189.
(RECEIVED March 25, 2005; FINAL REVISION April 19, 2005; ACCEPTED April 19, 2005)
| Abstract |
|---|
β-strands, but also for α-helices. The prediction accuracy of β-strands is lower if residues have a high RCO or a low RCO; the latter corresponds to the situation in which a β-sheet is formed by β-strands from different chains in a protein complex. The reason why the current study draws the opposite conclusion from the previous studies is examined. The implication for protein folding is also discussed.
Keywords: secondary structure prediction; long-range interaction; residue contact order; β-strand formation
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.051479505.
| Introduction |
|---|
α-helices, β-strands, or others (often referred to as coils). Essentially, in these methods a local region with a high density of amino acids that have a high propensity for a certain type of secondary structure is predicted to form that particular secondary structure type (Lim 1974; Chou and Fasman 1978a,b; Garnier et al. 1978). The current generation of prediction methods employs machine learning techniques such as neural networks (Rost and Sander 1994; Jones 1999; Petersen et al. 2000), hidden Markov models (Karplus et al. 1998; Lin et al. 2005), and support vector machines (Hua and Sun 2001; Ward et al. 2003; Guo et al. 2004), which try to capture local sequence patterns of known examples of secondary structures in an input multiple sequence alignment. Recently, some prediction methods have extended their prediction capability from the conventional three-state prediction to more states, including 3₁₀ helices, β-bulges, and turns (Karchin et al. 2003; Kuang et al. 2004). The current best methods achieve a three-state per-residue accuracy (the Q3 measure) of 75%–80% (Rost 2001; Rost and Eyrich 2001; McGuffin and Jones 2003). In spite of the gradual improvements made by modern methods, including those mentioned above, there is still a margin of 10%–15% left for further improvement before reaching the upper limit of prediction accuracy of ~90%. This upper limit was estimated from two observations: first, 5%–15% of secondary structure assignments can differ between different X-ray structures and NMR models of the same protein; second, secondary structure assignments are inconsistent between different methods, e.g., DSSP (Kabsch and Sander 1983) and STRIDE (Frishman and Argos 1995), and between their parameter settings (Levin 1997; Rost 2001).
It has been argued that one of the main reasons for this limitation is long-range amino acid interactions, which may override the local sequence propensity for a secondary structure: Most current methods assign a secondary structure from a window covering a local segment and thus usually do not explicitly consider long-range interactions of amino acids. Indeed, there are several concrete examples of secondary structures whose formation is influenced by long-range interactions (Minor and Kim 1996; Munoz et al. 1996). In an interesting experiment by Minor and Kim (1996), an 11-residue-long sequence changed its secondary structure according to its position in the global fold of protein G. It was also observed that small fragments of the same sequence are found in different secondary structures (Pan et al. 1999; Jacoboni et al. 2000; Zhou et al. 2000; Ikeda and Higo 2003).
In spite of this general consensus, there have so far been few systematic studies of the influence of long-range interactions on the prediction accuracy of secondary structures. Fiser et al. (1997) compared the accuracy of secondary structure prediction for residues with many long-range contacts and for the other residues and concluded that the role of long-range interactions in defining the secondary structures is overestimated. Pan et al. (1999) concluded that the current insufficient accuracy of secondary structure prediction may result from the limited size of the available database. Interestingly, both studies thus concluded that long-range interactions do not have a strong effect on the prediction accuracy of secondary structures. In our opinion, however, their statements come from indirect observation of long-range effects and do not analyze long-range interactions directly enough. Crooks and Brenner (2004) showed that local sequence information is insufficient to determine secondary structure, implying indirectly that nonlocal interaction is important for secondary structure formation. There are some other benchmark reports of secondary structure prediction methods, but none of them mention the effect of long-range interactions (Levin 1997; Przybylski and Rost 2002; McGuffin and Jones 2003).
In the current study, we directly address the effect of long-range interaction on the accuracy of current secondary structure prediction methods and come to a different conclusion. We introduce the residue contact order (RCO), which describes the separation of contacting residues in terms of the position in the sequence, and examine the relationship between the RCO and the prediction accuracy on a large, nonhomologous data set of 2777 proteins. Unlike previous studies, we do find a negative correlation between high RCO and the prediction accuracy. Typically, mispredicted residues with a high RCO are those that interact with other residues in a different domain. Interestingly, the negative correlation was found not only for β-strands, but also for α-helices. For β-strands, the prediction accuracy is relatively low when residues have a high RCO or a low RCO, indicating that there are many cases when formation of β-strands is affected by long-range interactions.
| Results |
|---|
α-helices can be better predicted than β-strands (e.g., cf. Q3a and Q3b), which is consistent with other studies (Rost 2001; Rost and Eyrich 2001; McGuffin and Jones 2003). Keep in mind that the benchmark proteins most probably include sequences that were used to train these programs, which would inflate the accuracy beyond that shown in the original publications. It is notable that ~60% of the proteins have at least one residue whose secondary structure is oppositely predicted (BADp). Comparing BADa and BADb, more residues in β-strands tend to be predicted oppositely than those in α-helices. This may indicate that some β-strands have a high propensity for α-helices, which was captured by the prediction methods.
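The accuracy measures discussed here can be illustrated with a short sketch. It assumes three-state strings over H (helix), E (strand), and C (coil), and it treats an "opposite" prediction as a helix residue predicted as strand or a strand residue predicted as helix; the function name, the dictionary keys, and the toy strings are hypothetical and are not taken from the study.

```python
def q3_report(observed, predicted):
    """Per-residue accuracy for three-state strings over H, E, and C."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    pairs = list(zip(observed, predicted))
    # Overall Q3: percentage of residues whose state is predicted correctly.
    report = {"Q3": 100.0 * sum(o == p for o, p in pairs) / len(pairs)}
    for state in "HEC":
        # Accuracy restricted to residues observed in this state.
        in_state = [p for o, p in pairs if o == state]
        if in_state:
            report["Q3_" + state] = 100.0 * in_state.count(state) / len(in_state)
    # "Opposite" predictions: helix predicted as strand, or strand as helix.
    report["n_opposite"] = sum((o, p) in {("H", "E"), ("E", "H")} for o, p in pairs)
    return report


# Toy example (made-up strings, for illustration only).
observed = "CHHHHHCCEEEEEC"
predicted = "CHHHHCCCEEHHEC"
print(q3_report(observed, predicted))
```

In this toy case two strand residues are predicted as helix, so they count as opposite predictions, whereas the helix residue predicted as coil lowers the accuracy without counting as opposite.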
β-strands have slightly higher RCO values.
Figure 4 shows the prediction accuracy with respect to the number of contacts. Looking at the range of the number of contacts (X-axis) where there is a sufficient amount of data, say ~5–12, α-helices and β-strands are better predicted when they are well packed (i.e., higher number of contacts). Residues in the other conformations (turns, loops) tend to be better predicted when they have a rather smaller number of contacts. One of the reasons for this observation may be that the sequence patterns of α-helices and β-strands in the core and of loops or turns on the surface of proteins are better learned due to the abundance of training data.
β-strands, thus taking long-range interaction into account. We have also tried a slightly different definition of the RCO, where adjacent residues are not counted in the contacting residues, resulting in essentially the same negative correlation with prediction accuracy.
β-strands, but also α-helices, is strongly influenced by high RCO interactions, which shows a clear contrast with residues in coils. To examine the statistical significance of this observation, we performed a χ² test using four categories of residues: residues with a lower (50–150) or a higher (150–250) RCO, each with a secondary structure that is either correctly or wrongly predicted. The null hypothesis that the two variables are independent is clearly rejected for residues in α-helices and β-strands, with χ² values of 59.0 and 80.8, respectively. For the coil regions, the null hypothesis cannot be rejected at the 5% significance level, with a χ² value of 2.6.
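As an illustration of this kind of test, the sketch below computes the Pearson χ² statistic for a 2 × 2 table (RCO bin versus correct or wrong prediction) and compares it with 3.841, the standard critical value for one degree of freedom at the 5% level. The counts are hypothetical and are not the counts used in the study, and the absence of a continuity correction is an assumption, since the text does not say whether one was applied.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table
    [[a, b], [c, d]]: rows = RCO bin, columns = correct / wrong prediction."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom, 5% level

# Hypothetical counts, NOT taken from the paper:
# lower-RCO bin: 800 correct, 200 wrong; higher-RCO bin: 150 correct, 100 wrong.
chi2 = chi_square_2x2(800, 200, 150, 100)
print(f"chi2 = {chi2:.2f}, independence rejected: {chi2 > CRITICAL_5PCT_DF1}")
```

With the same threshold, the reported values of 59.0 and 80.8 lie far above 3.841, whereas the coil value of 2.6 lies below it.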
Intriguingly, from Figure 5B we can also see that β-strands with a very low RCO (say, <20) are also not well predicted. A very low RCO indicates that the residue has no interactions with others or has only very local interactions.
Several examples of wrongly predicted secondary structure segments are shown in Figure 6. The first three panels, A–C, show high RCO residues whose secondary structures are wrongly predicted. The first protein, 1d3gA, forms a TIM-barrel fold (Fig. 6A). Eighteen residues at the interface that closes up the barrel have high RCO values, and the secondary structures of 44.4% of them are not correctly predicted. In the second example, 1a0i (Fig. 6B), the N-terminal tail wraps around the middle part of the protein, with a β-strand in the tail forming two β-sheets (shown in yellow), and β-strands in the middle of the protein. Nine out of 11 residues in the β-sheets are wrongly predicted. 1aofA (Fig. 6C) forms an eight-propeller fold (the bottom domain in the figure), and again residues at the interface that closes up the propeller have wrong secondary structure predictions. They include residues in an α-helix and a β-strand at the C terminus (shown in red). Figure 6D is an example of a mispredicted β-strand with a high RCO value: The highlighted strand of residues 146–152 in 1pmi is sandwiched between two β-strands that consist of distant residues, 266–271 and 285–297, respectively; thus, its strand formation is speculated to be assisted by hydrogen bonds from the two adjacent strands. In contrast to Figure 6D, the last two examples, Figure 6, E and F, show mispredicted strands with a low RCO value. The C-terminal β-strand in 1k3bC forms a β-sheet with another strand in a different chain, 1k3bB. Since the C-terminal strand is isolated from the core body of 1k3bC, its RCO value calculated within chain C is very low (Fig. 6E). Similarly, the highlighted strand in Figure 6F is stabilized by forming a β-sheet with a strand from a different chain. Another type of observed mispredicted strand with a low RCO is a strand that forms a two-stranded β-sheet with adjacent strands.
| Discussion |
|---|
α-helices and β-strands, which shows a clear difference from residues in coils (Fig. 5B).
At this juncture, it is appropriate to examine closely why our results differ from those of Fiser et al. (1997). In their study, Fiser et al. prepared two types of data sets, "highly interacting residues" and "stabilization center residues." The former residues are those that have a higher number of long-range interactions (defined as interactions between residues separated by at least 10 residues), and the latter residues are defined as pairs of distant residues in contact, with some of their flanking residues also in contact. Therefore, Fiser et al. are looking at the residues that are well packed in the core of proteins, which corresponds to our results on the correlation between the number of contacts of a residue and the prediction accuracy (Fig. 4). We would also point out that their data set at that time was much smaller (80 proteins) than that of the current study.
In contrast, the RCO that we have introduced differentiates contacts by their sequence separation; i.e., unlike the case in Fiser's report, a contact between residues 10 and 25 is categorized differently from a contact between residues 10 and 200. In addition, our results show that residues with a very high RCO value, say >150, which are residues that interact with other residues in a different domain, tend to have a low prediction accuracy on average. So in this sense, we could say that our results do not contradict Fiser's report; rather, we extended the analyses in a different way.
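The distinction drawn in this paragraph can be made concrete with a small sketch. It uses the example from the text (residue 10 in contact with residue 25 or with residue 200) and compares a count of long-range contacts, with the separation cutoff of 10 residues quoted above, against the average sequence separation of the contacts. The function names are hypothetical.

```python
def long_range_contact_count(i, partners, min_separation=10):
    """Number of contacts of residue i separated by at least min_separation residues."""
    return sum(abs(i - j) >= min_separation for j in partners)


def average_separation(i, partners):
    """Average sequence separation between residue i and its contact partners."""
    return sum(abs(i - j) for j in partners) / len(partners)


# Residue 10 in contact with residue 25, or with residue 200.
for partners in ([25], [200]):
    print(partners,
          long_range_contact_count(10, partners),
          average_separation(10, partners))
```

Both cases yield one long-range contact, but the average separation is 15 in the first case and 190 in the second, which is the difference that a count-based measure does not capture.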
All the secondary structure prediction methods are parameterized (trained) with known examples of sequences for each secondary structure (this is especially true for neural network-based approaches). What our analyses imply is that the secondary structure in the core of proteins is already well learned by the methods due to the abundance of examples, but some secondary structures in domain interfaces do not share sequence patterns with those in cores, which is the reason that prediction fails. The formation of these secondary structures with a high RCO is assisted by long-range interactions, which override the local propensity to a different secondary structure. It is also interesting to see that some of the isolated β-strands that form a β-sheet with another β-strand in a different chain (and thus have a very low RCO when the single chain is considered) are also difficult to predict. In the same way, the formation of these β-strands with a low RCO is also assisted and stabilized by β-strands from another chain.
Having observed that there are secondary structures influenced by long-range interactions, how can we then overcome this limitation of the accuracy of secondary structure prediction? Probably one of the most feasible and practical ways is to still use machine learning techniques, such as a neural network or support vector machine, but to train a separate predictor on the sequences of secondary structures with a high RCO (and β-strands with a low RCO) that were previously mispredicted. Another possible approach is to consider tertiary structures of proteins in secondary structure prediction. Indeed, there are some attempts along this line that show some improvements in the accuracy (Ito et al. 1997; Meiler and Baker 2003). Since secondary structure prediction is a crucial step for threading (Skolnick and Kihara 2001; McGuffin and Jones 2003; Skolnick et al. 2004) and ab initio-type protein tertiary structure prediction methods (Kihara et al. 2001), it will be worthwhile to revisit secondary structure prediction methods, taking advantage of the recent exponential increase of both sequence and structure data of proteins.
Interestingly, since secondary structure prediction methods can capture the intrinsic secondary structure propensity in a sequence, they can sometimes predict a more dynamic aspect of protein structures. For example, regions of proteins that undergo conformational switches (Young et al. 1999) and secondary structures in folding intermediate conformations of a protein (Shiraki et al. 1995) can be predicted by secondary structure prediction methods. Along this line, we would also like to mention recent results that show that nonnative secondary structures in folding intermediates are captured by tertiary structure prediction methods: Formation of nonnative secondary structure is suggested for some proteins in protein folding simulation using a coarse-grained protein model by Liwo et al. (2005; Skolnick 2005). Another example is a recent folding simulation of the SH3 domain using a fragment assembly-based protein structure prediction method (Chikenji et al. 2004), which showed formation of nonnative α-helices consistent with a folding experiment at subzero temperatures (H. Kihara, pers. comm.). Of course, molecular dynamics (MD) would be a natural tool to observe conformational transitions. An example is provided by Ikeda and Higo (2003), who employed a simulation of the "chameleon" sequence in the MATα2/MCM1/DNA complex that showed two local minima in its free energy landscape, which correspond to α-helical and β-strand conformations. Thus, not only rigorous MD simulations, but also other computational methods, are becoming capable of investigating folding intermediates and conformational changes of proteins. To conclude, we would like to emphasize that this will also be applied in bioinformatics-type approaches, which can take advantage of the growing number of available sequence or structure data.
Materials and methods
Data set
A total of 2777 nonredundant protein sequences were selected from the PDB database (Berman et al. 2000) with a sequence identity threshold value of 30%. Sequences were taken from the ATOM field of the PDB files. Sequences <50 residues long and those that had gaps were discarded. We used the secondary structure definition of the DSSP program (Kabsch and Sander 1983): Residues in H (α-helix), G (3₁₀ helix), and I (π helix) are considered to be in helical conformation, and those in E (extended strand) and B (β-bridge) are in β-strand conformation.
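The reduction of DSSP states described above can be sketched as a simple lookup. The helix and strand assignments follow the text; mapping every remaining DSSP symbol (including the blank) to coil, written here as C, is an assumption, as are the names `DSSP_TO_THREE_STATE` and `reduce_dssp`.

```python
# H, G, I -> helix (H); E, B -> strand (E), as stated in the text.
DSSP_TO_THREE_STATE = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
}


def reduce_dssp(dssp_string):
    """Map a DSSP state string to three states.

    Any symbol not listed above (T, S, blank, ...) is treated as coil (C),
    which is an assumption of this sketch.
    """
    return "".join(DSSP_TO_THREE_STATE.get(state, "C") for state in dssp_string)


print(reduce_dssp("  HHHHGGGTT EEEEB SS III "))
```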
Secondary structure prediction methods
Three secondary structure prediction methods, PSIPRED (Jones 1999), Jnet (Cuff and Barton 2000), and PREDATOR (Frishman and Argos 1996), were used in this study. PSIPRED uses a neural network that takes a multiple sequence alignment of a query sequence generated by PSI-BLAST as input (Altschul et al. 1997). It assigns a secondary structure (α-helix, β-strand, or coil) to the residue at the center of a sliding window of 15 residues. Jnet is another neural network-based approach, which uses a sliding window of 17 residues. It uses three different forms of multiple sequence alignments generated with homologous sequences retrieved by PSI-BLAST. PREDATOR uses FASTA (Pearson and Lipman 1988) for collecting sequences. An interesting feature of PREDATOR is that it tries to recognize potentially hydrogen-bonded residues in amino acid sequences using database-derived statistics on residue-type occurrences in different classes of β-bridges to delineate interacting β-strands. For all three methods, homologous sequences for multiple sequence alignments are collected from a sequence database that consists of SWISS-PROT, trEMBL (Boeckmann et al. 2003), and the KEGG genes database (Kanehisa et al. 2004).
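The sliding-window idea shared by PSIPRED and Jnet can be sketched as follows. This is only an illustration of how a window centered on each residue is formed; the actual programs take alignment-derived profiles rather than raw letters as input, and the padding symbol and function name used here are hypothetical.

```python
def sliding_windows(sequence, size=15, pad="-"):
    """Yield (center_residue, window) pairs.

    The termini are padded so that every residue sits at the center
    of a full-length window.
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    half = size // 2
    padded = pad * half + sequence + pad * half
    for k, residue in enumerate(sequence):
        yield residue, padded[k:k + size]


# Toy sequence with a short window so the output stays readable.
for residue, window in sliding_windows("MKTAYIAKQR", size=5):
    print(residue, window)
```

Because each prediction sees only the residues inside its window, contacts with residues far away in sequence are not directly visible to the predictor.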
Residue contact order
The contact order (CO) is the average sequence separation between contacting residues in the native state of a protein, which is defined by Equation 1, below. This simple index for describing the complexity of protein topology is often used in the context of the protein folding rate (Plaxco et al. 1998; Zhou and Zhou 2002; Ivankov et al. 2003):
$$\mathrm{CO} = \frac{1}{N}\sum_{i=1}^{L-1}\sum_{j=i+1}^{L} \Delta_{ij}\,|i-j| \qquad (1)$$
Here, Δij = 1 when residues i and j are in contact, and 0 otherwise. Two residues are considered to be in contact if any pair of heavy atoms, one from each residue, lies closer than a threshold distance. We used 4.5 and 6 Å as the threshold values. L is the length of the protein; N is the total number of contacts in the protein.
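The contact definition and the contact order can be sketched in a few lines. The example assumes that each residue is given as a list of heavy-atom coordinates in Å; the function names and the toy coordinates (one pseudo-atom per residue) are hypothetical, and adjacent residues are counted as contacts, as in the default definition.

```python
import math


def in_contact(atoms_a, atoms_b, threshold=4.5):
    """True if any heavy-atom pair, one atom from each residue, is closer than threshold (Å)."""
    return any(math.dist(p, q) < threshold for p in atoms_a for q in atoms_b)


def contact_pairs(residues, threshold=4.5):
    """Return all residue index pairs (i, j), i < j, that are in contact.

    `residues` is a list with one list of heavy-atom coordinates per residue.
    """
    n = len(residues)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if in_contact(residues[i], residues[j], threshold)]


def contact_order(pairs):
    """Average sequence separation |i - j| over all contacts."""
    return sum(j - i for i, j in pairs) / len(pairs)


# Toy structure: four residues, one pseudo-atom each (made-up coordinates).
toy = [[(0.0, 0.0, 0.0)], [(3.8, 0.0, 0.0)], [(7.6, 0.0, 0.0)], [(3.8, 3.0, 0.0)]]
for cutoff in (4.5, 6.0):
    pairs = contact_pairs(toy, cutoff)
    print(cutoff, pairs, contact_order(pairs))
```

Raising the threshold from 4.5 to 6 Å adds contacts, so the contact order depends on the chosen cutoff, which is presumably why both thresholds were examined.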
The relative contact order is a normalized CO by the length (L) of the protein:
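$$\mathrm{relative\ CO} = \frac{\mathrm{CO}}{L} = \frac{1}{L \cdot N}\sum_{i=1}^{L-1}\sum_{j=i+1}^{L} \Delta_{ij}\,|i-j| \qquad (2)$$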
Similarly, we define the residue contact order for residue i as the average contact order for the residue:
$$\mathrm{RCO}_i = \frac{1}{n}\sum_{j=1}^{L} \Delta_{ij}\,|i-j| \qquad (3)$$
Here n is the number of contacts between the ith residue and the others. Recently, a similar value named "residue-wise" contact order (RWCO) was introduced (Kinjo and Nishikawa 2005), which is simply the sum of the sequence separations of contacting residues (i.e., RWCOi = n × RCOi). Hence, RWCO has a higher correlation with the number of contacts than RCO does (Fig. 3B). After the RCO values of individual residues are calculated, the average RCO value in a smoothing window is assigned to the center residue of the window.
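A minimal sketch of the residue-level quantities follows, assuming the contacts are already available as index pairs (i, j). The function names are hypothetical, residues without contacts are given an RCO of 0 as a convention of this sketch, and the smoothing window size and its truncation at the termini are assumptions, since the text does not specify them.

```python
def residue_contact_order(pairs, length):
    """For each residue, the average |i - j| over its contacts.

    Residues without contacts get 0.0 (a convention chosen for this sketch).
    """
    separations = [[] for _ in range(length)]
    for i, j in pairs:
        separations[i].append(abs(i - j))
        separations[j].append(abs(i - j))
    return [sum(s) / len(s) if s else 0.0 for s in separations]


def residue_wise_contact_order(pairs, length):
    """RWCO: the plain sum of |i - j| over the contacts of each residue."""
    totals = [0] * length
    for i, j in pairs:
        totals[i] += abs(i - j)
        totals[j] += abs(i - j)
    return totals


def smooth(values, window=5):
    """Assign to each residue the mean over a centered window (truncated at the termini)."""
    half = window // 2
    smoothed = []
    for k in range(len(values)):
        chunk = values[max(0, k - half):k + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Toy contact list for a six-residue chain (made-up data).
pairs = [(0, 1), (1, 2), (1, 3), (0, 5)]
rco = residue_contact_order(pairs, 6)
print(rco)
print(residue_wise_contact_order(pairs, 6))
print(smooth(rco, window=3))
```

In the toy example, residue 1 has three short-range contacts and residue 5 has a single contact to residue 0: The sum-based RWCO is similar for the two (4 and 5), whereas the RCO separates them clearly (about 1.3 versus 5).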
| Acknowledgments |
|---|
| References |
|---|
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., and Bourne, P.E. 2000. The Protein Data Bank. Nucleic Acids Res. 28: 235–242.
Boeckmann, B., Bairoch, A., Apweiler, R., Blatter, M.C., Estreicher, A., Gasteiger, E., Martin, M.J., Michoud, K., O'Donovan, C., Phan, I., et al. 2003. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res. 31: 365–370.
Chikenji, G., Fujitsuka, W., and Takada, S. 2004. Protein folding mechanisms and energy landscape of src SH3 domain studied by a structure prediction toolbox. Chem. Phys. 307: 157–162.
Chou, P.Y. and Fasman, G.D. 1978a. Empirical predictions of protein conformation. Annu. Rev. Biochem. 47: 251–276.
Chou, P.Y. and Fasman, G.D. 1978b. Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. Relat. Areas Mol. Biol. 47: 45–148.
Crooks, G.E. and Brenner, S.E. 2004. Protein secondary structure: Entropy, correlations and prediction. Bioinformatics 20: 1603–1611.
Cuff, J.A. and Barton, G.J. 2000. Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins 40: 502–511.
Delano, W.L. 2002. The PyMOL Molecular Graphics System. http://www.pymol.org.
Fiser, A., Dosztanyi, Z., and Simon, I. 1997. The role of long-range interactions in defining the secondary structure of proteins is overestimated. Comput. Appl. Biosci. 13: 297–301.
Frishman, D. and Argos, P. 1995. Knowledge-based protein secondary structure assignment. Proteins 23: 566–579.
Frishman, D. and Argos, P. 1996. Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Protein Eng. 9: 133–142.
Garnier, J., Osguthorpe, D.J., and Robson, B. 1978. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120: 97–120.
Guo, J., Chen, H., Sun, Z., and Lin, Y. 2004. A novel method for protein secondary structure prediction using dual-layer SVM and profiles. Proteins 54: 738–743.
Hua, S. and Sun, Z. 2001. A novel method of protein secondary structure prediction with high segment overlap measure: Support vector machine approach. J. Mol. Biol. 308: 397–407.
Ikeda, K. and Higo, J. 2003. Free-energy landscape of a chameleon sequence in explicit water and its inherent α/β bifacial property. Protein Sci. 12: 2542–2548.
Ito, M., Matsuo, Y., and Nishikawa, K. 1997. Prediction of protein secondary structure using the 3D-1D compatibility algorithm. Comput. Appl. Biosci. 13: 415–424.
Ivankov, D.N., Garbuzynskiy, S.O., Alm, E., Plaxco, K.W., Baker, D., and Finkelstein, A.V. 2003. Contact order revisited: Influence of protein size on the folding rate. Protein Sci. 12: 2057–2062.
Jacoboni, I., Martelli, P.L., Fariselli, P., Compiani, M., and Casadio, R. 2000. Predictions of protein segments with the same amino acid sequence and different secondary structure: A benchmark for predictive methods. Proteins 41: 535–544.
Jones, D.T. 1999. Protein secondary structure prediction based on position-specific scoring matrices. J. Mol. Biol. 292: 195–202.
Kabsch, W. and Sander, C. 1983. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22: 2577–2637.
Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., and Hattori, M. 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32 Database issue: D277–D280.
Karchin, R., Cline, M., Mandel-Gutfreund, Y., and Karplus, K. 2003. Hidden Markov models that use predicted local structure for fold recognition: Alphabets of backbone geometry. Proteins 51: 504–514.
Karplus, K., Barrett, C., and Hughey, R. 1998. Hidden Markov models for detecting remote protein homologies. Bioinformatics 14: 846–856.
Kihara, D., Lu, H., Kolinski, A., and Skolnick, J. 2001. TOUCHSTONE: An ab initio protein structure prediction method that uses threading-based tertiary restraints. Proc. Natl. Acad. Sci. 98: 10125–10130.
Kinjo, A.R. and Nishikawa, K. 2005. Recoverable one-dimensional encoding of three-dimensional protein structures. Bioinformatics 21: 2167–2170.
Kuang, R., Leslie, C.S., and Yang, A.S. 2004. Protein backbone angle prediction with machine learning approaches. Bioinformatics 20: 1612–1621.
Levin, J.M. 1997. Exploring the limits of nearest neighbour secondary structure prediction. Protein Eng. 10: 771–776.
Lim, V.I. 1974. Structural principles of the globular organization of protein chains. A stereochemical theory of globular protein secondary structure. J. Mol. Biol. 88: 857–872.
Lin, K., Simossis, V.A., Taylor, W.R., and Heringa, J. 2005. A simple and fast secondary structure prediction method using hidden neural networks. Bioinformatics 21: 152–159.
Liwo, A., Khalili, M., and Scheraga, H.A. 2005. Ab initio simulations of protein-folding pathways by molecular dynamics with the united-residue model of polypeptide chains. Proc. Natl. Acad. Sci. 102: 2362–2367.
Matthews, B.W. 1975. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta 405: 442–451.
McGuffin, L.J. and Jones, D.T. 2003. Benchmarking secondary structure prediction for fold recognition. Proteins 52: 166–175.
Meiler, J. and Baker, D. 2003. Coupled prediction of protein secondary and tertiary structure. Proc. Natl. Acad. Sci. 100: 12105–12110.
Minor Jr., D.L. and Kim, P.S. 1996. Context-dependent secondary structure formation of a designed protein sequence. Nature 380: 730–734.
Munoz, V., Cronet, P., Lopez-Hernandez, E., and Serrano, L. 1996. Analysis of the effect of local interactions on protein stability. Fold. Des. 1: 167–178.
Nishikawa, K. and Ooi, T. 1980. Prediction of the surface-interior diagram of globular proteins by an empirical method. Int. J. Pept. Protein Res. 16: 19–32.
Pan, X.M., Niu, W.D., and Wang, Z.X. 1999. What is the minimum number of residues to determine the secondary structural state? J. Protein Chem. 18: 579–584.
Pearson, W.R. and Lipman, D.J. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. 85: 2444–2448.
Petersen, T.N., Lundegaard, C., Nielsen, M., Bohr, H., Bohr, J., Brunak, S., Gippert, G.P., and Lund, O. 2000. Prediction of protein secondary structure at 80% accuracy. Proteins 41: 17–20.
Plaxco, K.W., Simons, K.T., and Baker, D. 1998. Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277: 985–994.
Przybylski, D. and Rost, B. 2002. Alignments grow, secondary structure prediction improves. Proteins 46: 197–205.
Rost, B. 2001. Review: Protein secondary structure prediction continues to rise. J. Struct. Biol. 134: 204–218.
Rost, B. and Eyrich, V.A. 2001. EVA: Large-scale analysis of secondary structure prediction. Proteins (Suppl.) 5: 192–199.
Rost, B. and Sander, C. 1994. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19: 55–72.
Rost, B. and Sander, C. 2000. Third generation prediction of secondary structure. In Protein structure prediction (ed. B. Webster), pp. 71–95. Humana Press, Clifton, NJ.
Schulz, G.E. and Shirmer, R.H. 1979. Principles of protein structure. Springer-Verlag, New York.
Shiraki, K., Nishikawa, K., and Goto, Y. 1995. Trifluoroethanol-induced stabilization of the α-helical structure of β-lactoglobulin: Implication for non-hierarchical protein folding. J. Mol. Biol. 245: 180–194.
Skolnick, J. 2005. Putting the pathway back into protein folding. Proc. Natl. Acad. Sci. 102: 2265–2266.
Skolnick, J. and Kihara, D. 2001. Defrosting the frozen approximation: PROSPECTOR – A new approach to threading. Proteins 42: 319–331.
Skolnick, J., Kihara, D., and Zhang, Y. 2004. Development and large scale benchmark testing of the PROSPECTOR 3.0 threading algorithm. Proteins 56: 502–518.
Ward, J.J., McGuffin, L.J., Buxton, B.F., and Jones, D.T. 2003. Secondary structure prediction with support vector machines. Bioinformatics 19: 1650–1655.
Young, M., Kirshenbaum, K., Dill, K.A., and Highsmith, S. 1999. Predicting conformational switches in proteins. Protein Sci. 8: 1752–1764.
Zemla, A., Venclovas, C., Fidelis, K., and Rost, B. 1999. A modified definition of Sov, a segment-based measure for protein secondary structure prediction assessment. Proteins 34: 220–223.
Zhou, H. and Zhou, Y. 2002. Folding rate prediction using total contact distance. Biophys. J. 82: 458–463.
Zhou, X., Alber, F., Folkers, G., Gonnet, G.H., and Chelvanayagam, G. 2000. An analysis of the helix-to-strand transition between peptides with identical sequence. Proteins 41: 248–256.