School of Biochemistry and Molecular Biology, University of Leeds, Leeds, LS2 9JT, UK
Reprint requests to: David R. Westhead, School of Biochemistry and Molecular Biology, University of Leeds, Leeds, LS2 9JT, UK; e-mail: westhead{at}bmb.leeds.ac.uk; fax: 44-113-233-3167.
(RECEIVED February 5, 2002; FINAL REVISION April 10, 2002; ACCEPTED April 10, 2002)
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0204302.
| Abstract |
|---|
Keywords: Amyloidosis; protein misfolding; MySQL; pathogenic mutations; protein stability; Web interface
| Introduction |
|---|
Amyloid fibrils can form when conformational changes of a normally soluble protein lead to self-assembly into the fibrillar structure. The native protein is thought to partially or completely unfold to form a structure competent for self-assembly into amyloid fibrils (Kelly 1998). Amyloidosis correlates with mutations in the protein sequence, but these mutations occur in a number of locations within the sequence, and they do not generally change the structure of the native protein (e.g., lysozyme or transthyretin). It has been speculated that some mutations enhance the formation of an aggregation-prone non-native state (Hurle et al. 1994), and a decrease in protein stability may also lead to partial protein unfolding and aggregation into the fibrillar structure (for a summary, see Rochet and Lansbury 2000).
A great deal of in vitro work has focused on specific pathogenic mutations that lead to the formation of amyloid fibrils. This process can be reproduced in the laboratory by exposure of the protein to slightly denaturing conditions (Dobson 1999) such as altered pH or temperature. A vast amount of data concerning amyloid fibril formation is currently available in the literature. Organizing these data into a suitable database allows trends to be identified, leading to better insight into the fibril formation process.
Consequently, we have created the fibril_one database, which at present contains almost 250 mutations and 50 experimental conditions associated with 22 proteins, and is expanding continuously. Details of a number of proteins and associated diseases from the database can be found in Table 1. The fibril_one database contains mutations and experimental conditions associated with a number of different amyloidogenic proteins that lead to the formation of amyloid fibrils both in vivo and in vitro. The database was populated using information from the literature and mutation databases available on-line, such as the Alzheimer's Disease Mutation Database (http://www.alzforum.org). The database has an interactive, user-friendly Web interface (Fig. 1) that enables the user to look at mutations associated with specific proteins or amyloidogenic proteins in general.
| The fibril_one database |
|---|
The information in the database includes protein sequence and GENBANK (Benson et al. 2000), SWISSPROT (Bairoch and Apweiler 2000), and PDB (Berman et al. 2000) accession numbers; single and double pathogenic mutations; mutation names; disease names; secondary structure location of the mutated residue; effect of the mutation on the native structure, subunit interactions, and suggested mechanism of fibril formation; experimental conditions relating to fibril formation; data relating to both protective and causative factors for fibril formation; and full references for the data.
Additionally, the Web interface to the database provides the actual secondary structure of the complete protein from DSSP (Kabsch and Sander 1983) when a three-dimensional structure is available, or otherwise the predicted secondary structure from PHD (Rost and Sander 1994), with the mutated residue highlighted; a ClustalW (Thompson et al. 1994) multiple alignment of the protein with related family members, again with the mutated residue highlighted; and the hydrophobicity of residues and the degree of burial of the mutated residue in the hydrophobic core, as indicated by solvent accessibility calculations or predictions using PHDacc (Rost and Sander 1994).
The on-line search form, with a number of the different query options labeled, is shown in Figure 1. The simplest query that can be made with the on-line search form is to look at mutations that are associated with a specific protein, for example, transthyretin. The first few letters of the full protein name are required in the "protein name" box. The query can then be submitted using the "submit query" button, which returns 79 mutations listed in a table. A series of links from the returned results page makes it possible to look in more detail at each of the individual mutations that match the search query. For example, the secondary structure output includes specific details of the secondary structure and a link to the DSSP (Kabsch and Sander 1983) or PHD-predicted (Rost and Sander 1994) secondary structure alignment associated with each individual mutation, with the mutated residue highlighted. Additionally, the output for the "conservation" option is a table displaying the matching mutations, details relating to the residue conservation, and a link to the ClustalW (Thompson et al. 1994) multiple alignment, with the mutated residue highlighted.
The query form also enables the user to look at specific trends in the data. For example, the user can easily select mutations in specific locations, for instance, in α-helices or β-strands (shown in Fig. 1). Querying for mutations in β-strands results in 32 matching mutations, again with a choice of output options in the results page. With the summary output option, a series of bar charts makes trends associated with the search query easy to interpret. Using such methods we have identified trends in the fibril_one database; these are presented in the Discussion section.
The on-line search engine described here is a user-friendly HTML (hypertext mark-up language) form that creates and submits a series of SQL statements to the server where they are executed for subsequent data retrieval. A series of CGI (Common Gateway Interface) scripts then processes the results of a search, and displays them as an interactive series of tables and bar charts that provide details relating to those mutations that match the search query.
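The translation from form fields to an SQL statement can be sketched as follows. This is a minimal illustration and not the fibril_one implementation: the table name `mutation`, its columns, the toy rows, and the function `build_query` are all hypothetical, and Python's built-in `sqlite3` module stands in for the MySQL server so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema; the real fibril_one tables and columns are not given in the text.
SCHEMA = "CREATE TABLE mutation (protein TEXT, name TEXT, sec_struct TEXT)"

# Toy rows for illustration only; these are placeholders, not database content.
TOY_ROWS = [
    ("transthyretin", "example-1", "strand"),
    ("transthyretin", "example-2", "helix"),
    ("lysozyme", "example-3", "strand"),
]


def build_query(protein_prefix=None, sec_struct=None):
    """Turn optional form fields into a parameterized SELECT statement."""
    sql = "SELECT protein, name, sec_struct FROM mutation"
    clauses, params = [], []
    if protein_prefix:
        # Only the first few letters of the protein name are needed.
        clauses.append("protein LIKE ?")
        params.append(protein_prefix + "%")
    if sec_struct:
        clauses.append("sec_struct = ?")
        params.append(sec_struct)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY name", params


def demo():
    """Load the toy rows and run one combined query against them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO mutation VALUES (?, ?, ?)", TOY_ROWS)
    sql, params = build_query(protein_prefix="trans", sec_struct="strand")
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Passing user input as bound parameters rather than pasting it into the SQL string is a design choice of this sketch; it keeps form input from being interpreted as SQL.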
| Discussion |
|---|
Figure 2A is concerned with amino acid secondary structure propensities. It demonstrates a general trend for the β-strand propensity to decrease in fibrillogenic mutations. This is an unexpected result, particularly considering that fibrils have β-sheet secondary structure. It suggests that fibrillogenic mutations may have a destabilizing effect on the protein native structure, rather than directly stabilizing the fibrillar state. A more informative view of this trend is seen in Figure 2B. Here the query was limited to wild-type residues located in β-strands in the native structure. These constitute the majority of the residues from Figure 2A, and tend to be replaced by mutant residues with high α-helical or β-turn propensities, thereby destabilizing the native β-sheet structure. The same effect is observed when wild-type residues are located in α-helices, where replacements tend to have lower helix propensity. However, there were insufficient data to attach statistical significance to the results for wild-type residues in helices, so these are not presented here.
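The comparison behind this trend can be sketched as the difference in β-strand propensity between the mutant and the wild-type residue, averaged over a set of mutations. The sketch below is hedged accordingly: the propensity numbers are illustrative placeholders rather than the scale used for Figure 2, the function names are hypothetical, and the mutation names are used only as example input strings.

```python
from statistics import mean

# Illustrative placeholder values only; substitute a published propensity scale.
STRAND_PROPENSITY = {
    "V": 1.70, "I": 1.60, "L": 1.30, "T": 1.19,
    "M": 1.05, "G": 0.75, "P": 0.55, "E": 0.37,
}


def parse_mutation(name):
    """Split a name such as 'V30M' into (wild_type, position, mutant)."""
    return name[0], int(name[1:-1]), name[-1]


def propensity_change(name, scale):
    """Mutant propensity minus wild-type propensity for one mutation."""
    wild_type, _, mutant = parse_mutation(name)
    return scale[mutant] - scale[wild_type]


def mean_change(names, scale):
    """Average propensity change over a list of mutation names."""
    return mean(propensity_change(name, scale) for name in names)


if __name__ == "__main__":
    examples = ["V30M", "L55P"]  # example input strings, not query results
    for name in examples:
        print(name, round(propensity_change(name, STRAND_PROPENSITY), 2))
    print("mean", round(mean_change(examples, STRAND_PROPENSITY), 2))
```

A negative mean change corresponds to replacements with lower β-strand propensity than the wild-type residues, which is the direction of the trend described above.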
The on-line search engine described in this paper also enables specific selection of the wild-type residue. Using this option it is possible to study transitions between residues in different categories (hydrophobic, hydrophilic, buried, exposed, etc.), which yield better insight into the process of fibril formation. For example, Figure 2D shows mutations in which the wild-type residue is hydrophobic. Transitions from these residues are often to hydrophilic replacements. A more informative view of this is shown in Figure 2E. This represents mutations where the wild-type residue is hydrophobic and occupies a low solvent accessibility position (DSSP; Kabsch and Sander 1983). In this case, over half of the mutations with wild-type hydrophobic residues located in the core of the protein have hydrophilic replacement residues. Such replacements would reduce the stability of the native fold, and thereby promote formation of a partially folded intermediate and subsequent assembly into fibrils.
In contrast, Figure 2F is concerned with mutations in which the wild-type residue is hydrophilic. Replacements are almost equally divided between hydrophilic and hydrophobic residues. Replacement of these hydrophilic residues with hydrophobic ones might reduce the stability of the fold, particularly if the hydrophilic wild-type residues were involved in hydrogen bonds, electrostatic interactions, or salt bridges, but it could equally promote aggregation.
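The transition counts discussed above can be tallied with a small helper. This is a sketch under stated assumptions: the hydrophobic residue set below is one simple convention and not necessarily the classification used in fibril_one, the function names are hypothetical, and the mutation names serve only as example input strings.

```python
from collections import Counter

# Assumed classification for illustration; other hydrophobicity schemes differ.
HYDROPHOBIC = frozenset("AVLIMFWC")


def residue_class(residue):
    """Label a one-letter residue code as 'hydrophobic' or 'hydrophilic'."""
    return "hydrophobic" if residue in HYDROPHOBIC else "hydrophilic"


def count_transitions(names):
    """Count (wild-type class, mutant class) pairs for names like 'I56T'."""
    counts = Counter()
    for name in names:
        wild_type, mutant = name[0], name[-1]
        counts[(residue_class(wild_type), residue_class(mutant))] += 1
    return counts


if __name__ == "__main__":
    examples = ["V30M", "I56T", "D67H"]  # example input strings only
    for (source, target), n in sorted(count_transitions(examples).items()):
        print(f"{source} -> {target}: {n}")
```

Restricting the input list to mutations at buried positions would give the kind of breakdown described for Figure 2E; that filter is left out here because it depends on solvent accessibility data not reproduced in this text.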
We have described the fibril_one database and on-line SQL-based search engine, and used them to identify trends in the data associated with the propensity of proteins to form amyloid fibrils. The trends identified do not show any general tendency toward mutations that enhance either the β-sheet propensity or the likelihood of aggregation through increased hydrophobicity. Rather, we find trends toward mutations that destabilize the native protein structure. In the core, hydrophobic wild-type residues are often replaced with hydrophilic residues. In β-strands, replacement residues tend to have lower propensities for the native β-structure. Insight gained from the database may be relevant not only to mutations associated with fibrillogenesis but also to diseases associated with protein misfolding in general; further analysis of this would require a general database containing mutations associated with all misfolding diseases. Beyond the identification of these trends, we hope that the database will be useful to those researching the process of amyloid formation in vitro and in vivo.
Database accessibility and availability
The Web interface, including the SQL-search form to the fibril_one database, is freely available at http://www.bioinformatics.leeds.ac.uk/group/online/fibril_one.
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
Benson, D., Karsch-Mizrachi, I., Lipman, D., Ostell, J., Rapp, B., and Wheeler, D. 2000. GenBank. Nucleic Acids Res. 28: 15-18.
Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T., Weissig, H., Shindyalov, I., and Bourne, P. 2000. The Protein Data Bank. Nucleic Acids Res. 28: 235-242.
Chiti, F., Webster, P., Taddei, N., Clark, A., Stefani, M., Ramponi, G., and Dobson, C. 1999. Designing conditions for in vitro formation of amyloid protofilaments and fibrils. Proc. Natl. Acad. Sci. 96: 3590-3594.
Dobson, C. 1999. Protein misfolding, evolution and disease. Trends Biochem. Sci. 24: 329-333.
Ennos, R. 2000. Statistical and data handling skills in biology. Pearson Education Limited, Essex, England.
Hurle, M., Helms, L., Li, L., Chan, W., and Wetzel, R. 1994. A role for destabilising amino acid replacements in light chain amyloidosis. Proc. Natl. Acad. Sci. 91: 5446-5450.
Kabsch, W. and Sander, C. 1983. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22: 2577-2637.
Kelly, J. 1998. The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Opin. Struct. Biol. 8: 101-106.
Ramirez-Alvarado, M., Merkel, J., and Regan, L. 2000. A systematic exploration of the influence of the protein stability on amyloid fibril formation in vitro. Proc. Natl. Acad. Sci. 97: 8979-8984.
Rochet, J. and Lansbury, P. 2000. Amyloid fibrillogenesis: Themes and variations. Curr. Opin. Struct. Biol. 10: 60-68.
Rost, B. and Sander, C. 1994. 1D secondary structure prediction through evolutionary profiles. Proteins 20: 257-276.
Sunde, M., Serpell, L., Bartlam, M., Fraser, P., Pepys, M., and Blake, C. 1997. Common core structure of amyloid fibrils by synchrotron X-ray diffraction. J. Mol. Biol. 273: 729-739.
Thompson, J., Higgins, D., and Gibson, T. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673-4680.
Wardlow, A. 1999. Practical statistics for experimental biologists (2nd ed.). John Wiley & Sons, West Sussex, England.
Williams, R., Chang, A., Jurech, D., and Loughran, S. 1987. Secondary structure predictions and medium range interactions. Biochim. Biophys. Acta 916: 200-204.