Protein Science
Protein Science (2002), 11:1862-1866.
Copyright © 2002 The Protein Society

FOR THE RECORD

The fibril_one on-line database: Mutations, experimental conditions, and trends associated with amyloid fibril formation

Jennifer A. Siepen and David R. Westhead

School of Biochemistry and Molecular Biology, University of Leeds, Leeds, LS2 9JT, UK

Reprint requests to: David R. Westhead, School of Biochemistry and Molecular Biology, University of Leeds, Leeds, LS2 9JT, UK; e-mail: westhead{at}bmb.leeds.ac.uk; fax: 44-113-233-3167.

(RECEIVED February 5, 2002; FINAL REVISION April 10, 2002; ACCEPTED April 10, 2002)

Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0204302.


    Abstract
The association of amyloid fibril formation with a number of important diseases, and the extensive study of this process in vitro, have resulted in a large literature containing a vast amount of information about the fibril formation process. This includes mutations and experimental conditions that promote or protect against fibril formation. A database (fibril_one) was designed to hold information relating to the formation of fibrils. It was populated by extensive searches of the literature and other databases. A powerful World Wide Web query interface to the database was developed, enabling a simple and effective method to view amyloidogenic mutations associated with specific proteins. The Web interface was used to identify trends in the data. This revealed that mutations promoting fibril formation through altered folding tend to be associated with destabilization of the native fold. In particular, tendencies of mutations to disrupt the native secondary structure and packing in the hydrophobic core were found to be significant. Query access to the database is available freely on the World Wide Web at http://www.bioinformatics.leeds.ac.uk/group/online/fibril_one.

Keywords: Amyloidosis; protein misfolding; MySQL; pathogenic mutations; protein stability; Web interface


    Introduction
Amyloid fibrils have been associated with a range of diseases including Alzheimer's disease, type II diabetes mellitus, and spongiform encephalopathies (Chiti et al. 1999). Over 20 proteins have been described as amyloidogenic, and despite large differences in their size, amino acid sequence, native structure, and function (Ramirez-Alvarado et al. 2000), all amyloid fibrils consist of a helical array of β-strands arranged perpendicular to the fibril long axis. They are of indefinite length, unbranched, 70–120 Å in diameter, and display a green birefringence when viewed in polarized light after Congo red staining (Sunde et al. 1997).

Amyloid fibrils can form when conformational changes of a normally soluble protein lead to self-assembly into the fibrillar structure. The native protein is thought to partially or completely unfold to form a structure competent for self-assembly into amyloid fibrils (Kelly 1998). Amyloidosis correlates with mutations in the protein sequence, but these mutations occur in a number of locations within the sequence, and they do not generally change the structure of the native protein (e.g., lysozyme or transthyretin). It has been speculated that some mutations enhance the formation of an aggregation-prone non-native state (Hurle et al. 1994), and a decrease in protein stability may also lead to partial protein unfolding and aggregation into the fibrillar structure (for a summary, see Rochet and Lansbury 2000).

A great deal of in vitro work has focused on specific pathogenic mutations that lead to the formation of amyloid fibrils. This process can be reproduced in the laboratory by exposure of the protein to slightly denaturing conditions (Dobson 1999) such as altered pH or temperature. A vast amount of data is currently available in the literature concerning amyloid fibril formation. Organization of this into a suitable database allows trends to be identified in the data, leading to better insight into the fibril formation process.

Consequently, we have created the fibril_one database, which at present contains almost 250 mutations and 50 experimental conditions associated with 22 proteins, and is expanding continuously. Details of a number of proteins and associated diseases from the database can be found in Table 1. The fibril_one database contains mutations and experimental conditions associated with a number of different amyloidogenic proteins that lead to the formation of amyloid fibrils both in vivo and in vitro. The database was populated using information from the literature and mutation databases available on-line, such as the Alzheimer's Disease Mutation Database (http://www.alzforum.org). The database has an interactive, user-friendly Web-based interface (Fig. 1) that enables the user to look at mutations associated with specific proteins or with amyloidogenic proteins in general.


Table 1. Proteins from the fibril_one database and associated amyloid-related diseases
 


Fig. 1. The on-line search form for the fibril_one database. A screen capture of the on-line search form for the fibril_one database. Parts of the form are labeled; these enable selection of specific proteins and residues associated with fibrillogenesis.

 
The purpose of this communication is to introduce the fibril_one database and its World Wide Web interface. This includes an SQL (structured query language)-based search engine that allows the user to query all aspects of the data. In addition, we report a number of trends observed in the fibril_one database that support recent findings such as those of Hurle et al. (1994) and Ramirez-Alvarado et al. (2000).


    The fibril_one database
The fibril_one database is a relational database held on a MySQL server (version 3.22.32), with information relating to mutations, experimental conditions, diseases, and structural alterations associated with amyloid fibril formation in vivo and in vitro. The database was designed by Entity-Relationship (ER) modeling, and a full schema can be found on the Web site (address given in the database availability section).

The information in the database includes protein sequence and GENBANK (Benson et al. 2000), SWISSPROT (Bairoch and Apweiler 2000), and PDB (Berman et al. 2000) accession numbers; single and double pathogenic mutations; mutation names; disease names; secondary structure location of the mutated residue; effect of the mutation on the native structure, subunit interactions, and suggested mechanism of fibril formation; experimental conditions relating to fibril formation; data relating to both protective and causative factors for fibril formation; and full references for the data.

Additionally, the Web interface to the database provides the actual secondary structure of the complete protein from DSSP (Kabsch and Sander 1983) when a three-dimensional structure is available, or otherwise the predicted secondary structure from PHD (Rost and Sander 1994), with the mutated residue highlighted; a ClustalW (Thompson et al. 1994) multiple alignment of the protein with related family members, again with the mutated residue highlighted; and the hydrophobicity of residues and degree of burial of the mutated residue in the hydrophobic core, as indicated by solvent accessibility calculations or by predictions using PHDacc (Rost and Sander 1994).

The on-line search form, with a number of the different query options labeled, is shown in Figure 1. The simplest query that can be made with the on-line search form is to look at mutations that are associated with a specific protein, for example, transthyretin. The first few letters of the full protein name are required in the "protein name" box. This can then be submitted using the "submit query" button, revealing 79 mutations, which are listed in a table. A series of links from the returned results page makes it possible to look in more detail at each of the individual mutations that match the search query. For example, the secondary structure option gives specific details of the secondary structure and a link to the DSSP (Kabsch and Sander 1983) or PHD-predicted (Rost and Sander 1994) secondary structure alignment associated with each individual mutation, with the mutated residue highlighted. Additionally, the output for the "conservation" option is a table displaying the matching mutations, details relating to the residue conservation, and a link to the ClustalW (Thompson et al. 1994) multiple alignment, with the mutated residue highlighted.
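The protein-name query described above can be sketched as follows. This is only an illustration, written in Python with the standard sqlite3 module rather than MySQL. The table and column names (protein, mutation, wild_type, and so on) are hypothetical, since the real schema is published on the database Web site and is not reproduced here, and the inserted rows are placeholders rather than records copied from fibril_one.

```python
import sqlite3

# Hypothetical two-table layout, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE protein  (protein_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mutation (mutation_id INTEGER PRIMARY KEY,
                           protein_id  INTEGER REFERENCES protein(protein_id),
                           wild_type   TEXT,
                           position    INTEGER,
                           replacement TEXT);
""")
conn.executemany("INSERT INTO protein VALUES (?, ?)",
                 [(1, "Transthyretin"), (2, "Lysozyme")])
# Placeholder rows, not records copied from fibril_one.
conn.executemany(
    "INSERT INTO mutation (protein_id, wild_type, position, replacement) "
    "VALUES (?, ?, ?, ?)",
    [(1, "V", 30, "M"), (1, "L", 55, "P"), (2, "I", 56, "T")])


def mutations_for(conn, name_prefix):
    """Return (protein name, mutation) pairs for proteins whose name
    starts with name_prefix, ordered by residue position."""
    rows = conn.execute(
        """SELECT p.name, m.wild_type || m.position || m.replacement
           FROM mutation AS m JOIN protein AS p USING (protein_id)
           WHERE p.name LIKE ? || '%'
           ORDER BY m.position""",
        (name_prefix,))
    return rows.fetchall()


print(mutations_for(conn, "trans"))
```

The prefix match relies on SQL's LIKE operator, which in SQLite ignores ASCII case by default, so the first few letters of the name are enough whatever their capitalization.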

The query form also enables the user to look at specific trends in the data. For example, the user can easily select mutations in specific locations, for instance, in α-helices or β-strands (shown in Fig. 1). Querying for mutations in β-strands results in 32 matching mutations, again with a choice of output options in the results page. Using the summary output option, a series of bar charts makes trends associated with the search query easy to interpret. Using such methods we have identified trends in the fibril_one database; these are presented in the Discussion section.

The on-line search engine described here is a user-friendly HTML (hypertext mark-up language) form that creates and submits a series of SQL statements to the server where they are executed for subsequent data retrieval. A series of CGI (Common Gateway Interface) scripts then processes the results of a search, and displays them as an interactive series of tables and bar charts that provide details relating to those mutations that match the search query.
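As a rough sketch of how submitted form fields might be turned into an SQL statement, the function below builds a parameterized query from a dictionary of form values. It is not the authors' CGI code, whose language and internals are not described here. The field names, column names, and the `build_query` helper are all assumptions made for illustration.

```python
def build_query(form):
    """Turn a dict of submitted form fields into (sql, params).

    Field and column names are hypothetical. Only whitelisted fields are
    used, and values are passed as bound parameters rather than being
    pasted into the SQL text.
    """
    # Form field -> SQL condition (hypothetical columns).
    conditions = {
        "protein_name": "p.name LIKE ? || '%'",
        "secondary_structure": "m.secondary_structure = ?",
        "wild_type": "m.wild_type = ?",
    }
    where, params = [], []
    for field, clause in conditions.items():
        value = form.get(field, "").strip()
        if value:  # empty form boxes add no condition
            where.append(clause)
            params.append(value)

    sql = ("SELECT p.name, m.wild_type, m.position, m.replacement "
           "FROM mutation AS m JOIN protein AS p USING (protein_id)")
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, params


sql, params = build_query({"protein_name": "transthyretin",
                           "secondary_structure": "strand"})
print(sql)
print(params)
```

Keeping the user's values out of the SQL string and binding them as parameters is a design choice of this sketch: it guards against malformed or malicious input reaching the database.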


    Discussion
The on-line SQL-search engine was used to identify various trends in the fibril_one database. We were interested primarily in mutations leading to fibrillogenesis via altered folding (e.g., transthyretin), which is an intrinsic property of the protein concerned, rather than fibrillogenesis via altered cleavage (amyloid precursor protein and presenilin 1), which depends critically on the specificity of cleavage enzymes. The search queries reported below were all disease related (these constitute the majority of mutations in the database), and were limited to these cases using an option on the on-line search form. Below we report several important trends in the database. In all cases trends were tested for statistical significance using standard chi-squared tests for association (Ennos 2000) to identify cases where counts of residues in particular categories (e.g., hydrophobic/hydrophilic) differed significantly between wild-type and mutant residue sets. The trends reported below were found to be significant at the 5% level, except where stated otherwise, or in those cases where the expected counts were too small for the test to be applicable (Wardlow 1999).
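A chi-squared test for association of this kind can be sketched in a few lines of Python. The counts below are hypothetical and are not taken from fibril_one. The table layout (wild-type versus mutant rows, hydrophobic versus hydrophilic columns) is one plausible reading of the description above. The minimum expected count of 5 is a common rule of thumb, assumed here rather than quoted from the paper, and no continuity correction is applied.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and expected counts for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    stat = sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))
    return stat, expected


# Hypothetical counts, not taken from fibril_one.
# Rows: wild-type, mutant. Columns: hydrophobic, hydrophilic.
counts = [[60, 40],
          [40, 60]]
stat, expected = chi_squared(counts)

CRITICAL_5_PERCENT_DF1 = 3.841  # chi-squared critical value, 1 degree of freedom
# Rule-of-thumb applicability check (assumed threshold of 5).
applicable = all(e >= 5 for row in expected for e in row)
significant = applicable and stat > CRITICAL_5_PERCENT_DF1
print(f"chi2 = {stat:.2f}, applicable = {applicable}, significant = {significant}")
```

For a 2 × 2 table there is one degree of freedom, so a statistic above 3.841 indicates an association that is significant at the 5% level.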

Figure 2A is concerned with amino acid secondary structure propensities. This demonstrates a general trend for the β-strand propensity to decrease in fibrillogenic mutations. This is an unexpected result, particularly considering that fibrils have β-sheet secondary structure. It suggests that fibrillogenic mutations may have a destabilizing effect on the protein native structure, rather than directly stabilizing the fibrillar state. A more informative view of this trend is seen in Figure 2B. Here the query was limited to wild-type residues located in β-strands in the native structure. These constitute the majority of the residues from Figure 2A, and tend to be replaced by mutant residues with high α-helical or β-turn propensities, thereby destabilizing the native β-sheet structure. The same effect is observed when wild-type residues are located in α-helices, where replacements tend to have lower helix propensity. However, there were insufficient data to attach statistical significance to the results for wild-type residues in helices, so these are not presented here.



Fig. 2. Trends seen in the fibril_one database. Histograms representing trends in the fibril_one database. Z-axis values correspond to the number of mutations in the database matching the search query. In all cases the queries were limited to mutations associated with disease and fibrillogenesis via altered folding rather than altered cleavage. (A) Amino acid secondary structure propensities of residues occurring in fibrillogenic mutations. Residues are identified with the secondary structure class for which they have the highest propensity (Williams et al. 1987). (B) Amino acid secondary structure propensities of residues occurring in fibrillogenic mutations, where the wild-type residue is located in a β-strand. Residues are identified with the secondary structure class for which they have the highest propensity (Williams et al. 1987). (C) The hydrophobicity of wild-type and replacement residues occurring in fibrillogenic mutations. (D) The hydrophobicity of residues occurring in fibrillogenic mutations, where the wild-type residue is hydrophobic. (E) The hydrophobicity of residues occurring in fibrillogenic mutations, where the wild-type residue is hydrophobic and located in a predicted low solvent accessibility position. (F) The hydrophobicity of residues occurring in fibrillogenic mutations associated with disease, where the wild-type residue is hydrophilic.

 
Figure 2C shows the hydrophobicity of wild-type and replacement residues that are associated with disease and fibrillogenesis by altered folding. Although there are some small differences between the wild-type and replacement residues (replacements tend to be more hydrophilic), the insight into the effect of the mutation is limited. There is certainly no overall tendency for replacement with hydrophobic residues that might promote aggregation.

The on-line search engine described in this paper also enables specific selection of the wild-type residue. Using this option it is possible to study transitions between residues in different categories (hydrophobic, hydrophilic, buried, exposed, etc.), which yield better insight into the process of fibril formation. For example, Figure 2D shows mutations in which the wild-type residue is hydrophobic. Transitions from these residues are often to hydrophilic replacements. A more informative view of this is shown in Figure 2E. This represents mutations where the wild-type residue is hydrophobic and occupies a low solvent accessibility position (DSSP; Kabsch and Sander 1983). In this case, over half of the mutations with wild-type hydrophobic residues located in the core of the protein have hydrophilic replacement residues. Such replacements would reduce the stability of the native fold, and thereby promote formation of a partially folded intermediate, and subsequent assembly into fibrils.
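The idea of counting transitions between residue categories can be illustrated with a short Python sketch. The set of residues treated as hydrophobic is an assumption, because the classification used for the database is not stated here. The substitution strings are illustrative and are not records from fibril_one, and the helper names (`category`, `tally_transitions`) are hypothetical.

```python
from collections import Counter

# Assumed hydrophobic set (one-letter amino acid codes); the classification
# actually used by fibril_one is not stated in the text.
HYDROPHOBIC = set("AVLIMFWC")


def category(residue):
    """Classify a one-letter residue code under the assumed set above."""
    return "hydrophobic" if residue in HYDROPHOBIC else "hydrophilic"


def tally_transitions(mutations):
    """Count wild-type -> replacement category transitions.

    Each mutation is a string such as 'V30M': wild-type residue,
    sequence position, replacement residue.
    """
    return Counter((category(m[0]), category(m[-1])) for m in mutations)


# Illustrative substitution strings, not records from fibril_one.
examples = ["V30M", "L55P", "I84S", "T60A", "D18G"]
transitions = tally_transitions(examples)
for (before, after), n in sorted(transitions.items()):
    print(f"{before} -> {after}: {n}")
```

Counts of this kind, restricted to a chosen wild-type category, are what each bar in a summary histogram would represent.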

In contrast, Figure 2F is concerned with mutations in which the wild-type residue is hydrophilic. Replacements are almost equally divided between hydrophilic and hydrophobic residues. Replacement of these hydrophilic residues with hydrophobics might reduce the stability of the fold, particularly if the hydrophilic wild-type residues were involved in hydrogen bonds, electrostatic interactions, or salt bridges, but it could equally promote aggregation.

We have described the fibril_one database and on-line SQL-based search engine, and used this to identify trends in the data associated with the propensity of proteins to form amyloid fibrils. The trends identified do not show any general tendency toward mutations that enhance either the β-sheet propensity or the likelihood of aggregation through increased hydrophobicity. Rather, we find trends toward mutations that destabilize the native protein structure. In the core, hydrophobic wild-type residues are often replaced with hydrophilic residues. In β-strands, replacement residues tend to have lower propensities for the native β-structure. Insight gained from the database may apply not only to mutations associated with fibrillogenesis but also to diseases associated with protein misfolding in general; further analysis of this would require a general database containing mutations associated with all misfolding diseases. Beyond the identification of the trends, we hope that the database will be useful to those researching the process of amyloid formation in vitro and in vivo.

Database accessibility and availability
The Web interface, including the SQL-search form to the fibril_one database, is freely available at http://www.bioinformatics.leeds.ac.uk/group/online/fibril_one.


    Acknowledgments
 
We thank the MRC for sponsorship, Dr. Steven Pickering of Leeds University for his technical assistance, and Professor Nigel Hooper and Professor Sheena Radford of Leeds University for critical reading of the manuscript. We are also indebted to authors too numerous to cite individually here, whose published data are included in our database with full references.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.


    References
Bairoch, A. and Apweiler, R. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28:45–48.

Benson, D., Karsch-Mizrachi, I., Lipman, D., Ostell, J., Rapp, B., and Wheeler, D. 2000. GenBank. Nucleic Acids Res. 28:15–18.

Berman, H., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T., Weissig, H., Shindyalov, I., and Bourne, P. 2000. The Protein Data Bank. Nucleic Acids Res. 28:235–242.

Chiti, F., Webster, P., Taddei, N., Clark, A., Stefani, M., Ramponi, G., and Dobson, C. 1999. Designing conditions for in vitro formation of amyloid protofilaments and fibrils. Proc. Natl. Acad. Sci. 96:3590–3594.

Dobson, C. 1999. Protein misfolding, evolution and disease. Trends Biochem. Sci. 24:329–333.

Ennos, R. 2000. Statistical and data handling skills in biology. Pearson Education Limited, Essex, England.

Hurle, M., Helms, L., Li, L., Chan, W., and Wetzel, R. 1994. A role for destabilising amino acid replacements in light chain amyloidosis. Proc. Natl. Acad. Sci. 91:5446–5450.

Kabsch, W. and Sander, C. 1983. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22:2577–2637.

Kelly, J. 1998. The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Opin. Struct. Biol. 8:101–106.

Ramirez-Alvarado, M., Merkel, J., and Regan, L. 2000. A systematic exploration of the influence of the protein stability on amyloid fibril formation in vitro. Proc. Natl. Acad. Sci. 97:8979–8984.

Rochet, J. and Lansbury, P. 2000. Amyloid fibrillogenesis: Themes and variations. Curr. Opin. Struct. Biol. 10:60–68.

Rost, B. and Sander, C. 1994. 1D secondary structure prediction through evolutionary profiles. Proteins 20:257–276.

Sunde, M., Serpell, L., Bartlam, M., Fraser, P., Pepys, M., and Blake, C. 1997. Common core structure of amyloid fibrils by synchrotron X-ray diffraction. J. Mol. Biol. 273:729–739.

Thompson, J., Higgins, D., and Gibson, T. 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22:4673–4680.

Wardlow, A. 1999. Practical statistics for experimental biologists (2nd ed.). John Wiley & Sons, West Sussex, England.

Williams, R., Chang, A., Jurech, D., and Loughran, S. 1987. Secondary structure predictions and medium range interactions. Biochim. Biophys. Acta 916:200–204.

