1 Department of Chemistry, University at Albany, SUNY, Albany, New York 12222, USA
2 Department of Chemistry and Chemical Biology, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
(RECEIVED October 29, 2006; FINAL REVISION January 25, 2007; ACCEPTED February 4, 2007)
Abstract
Keywords: two-state transition; amyloid fibrils; chemometric analysis; deep UV resonance Raman spectroscopy; electrospray ionization mass spectrometry
Introduction
structure is undergoing extensive study (Kelly 1998a; Dobson 2003; Tycko 2004; Uversky and Fink 2004; Ban et al. 2006; Daggett 2006). The cross-β structure consists of extended parallel/antiparallel β-strands with 0.47-nm spacing arranged perpendicular to the long axis of the fibrils by the mating of side chains, which is termed a steric zipper (Sunde and Blake 1998; Serpell 2000; Nelson et al. 2005). The β-strands are hydrogen-bonded into sheets that run parallel to the axis of the fibrils in a twisted manner with 1.0-nm spacing, forming what is called a protofilament (Tycko 2004; Eisenberg et al. 2006). Several protofilaments arrange laterally around a "hollow" center to form unbranched and straight fibrils, which are usually ~10 nm in diameter with lengths ranging from hundreds of nanometers to tens of micrometers (Sunde and Blake 1998; Serpell 2000; Wetzel 2002). Although amyloid depositions are usually the major histological characteristic of neurodegenerative diseases, it has been postulated that a misfolded/partially unfolded intermediate of fibril formation might be the major culprit responsible for the adverse physiological effects (Bellotti et al. 2000; Blanch et al. 2000; Hardy and Selkoe 2002; Lashuel et al. 2002; Bucciantini et al. 2004). It is thought that an intermediate, which is devoid of well-characterized tertiary structure while retaining a certain level of secondary structure, is essential for the conformational changes that lead to fibril formation (Bellotti et al. 2000; Goers et al. 2002; Cao et al. 2004; Uversky and Fink 2004; Surewicz et al. 2006). This partially unfolded intermediate shares many conformational properties with an early intermediate of protein folding, a premolten globule state. The intrinsic flexibility of the partially unfolded conformation is essential for the protein to initiate the structural rearrangement that leads to fibril formation (Uversky and Fink 2004). Consequently, characterizing protein structural evolution at the early stages of fibrillation is important for understanding the overall biochemical mechanism of fibril formation as well as for developing new therapeutic strategies against these devastating diseases.
Many biophysical methods such as solution NMR (Chamberlain et al. 2001; Khurana et al. 2001; Canet et al. 2002), FT-IR (Khurana et al. 2001; Adams et al. 2002), CD (Goda et al. 2000; Khurana et al. 2001; McLaughlin et al. 2006), and small-angle X-ray/neutron scattering (Hirai et al. 1999; Yonezawa et al. 2002) have been utilized for studying the molecular mechanism of amyloid fibril formation. However, the insolubility of amyloid fibrils poses serious limitations to the application of these conventional techniques because of the light scattering nature of the samples. Raman spectroscopy is able to characterize such systems because the Raman effect is, by its nature, a light scattering phenomenon. Utilization of UV irradiation at ~200 nm enables selective resonance enhancement of vibrational modes from the amide chromophore (i.e., the peptide bond), the building block of a polypeptide backbone. The technique, deep UV resonance Raman (DUVRR) spectroscopy, has been shown to be a powerful tool for structural characterization of proteins and polypeptides (Chi et al. 1998; Asher 2001; Lednev et al. 2005; Xu et al. 2005; JiJi et al. 2006). We have recently demonstrated that DUVRR spectroscopy can be used for probing structural evolution of an amyloidogenic protein at all stages of fibril formation (Xu et al. 2005).
Electrospray ionization mass spectrometry (ESI-MS) is a well established method for characterizing protein structural transition (Hooke et al. 1995; Konermann 2004; Kaltashov and Eyles 2005; Craig et al. 2006). The technique has also been applied in studying the amyloid fibrillation of amyloidogenic proteins, lysozyme in particular (Nettleton and Robinson, 2000; Caddy and Robinson 2006). The analysis of different ion charge state distribution (CSD) envelopes derived from different protein conformations is one of the methods for monitoring protein structural transition by ESI-MS (Chowdhury et al. 1990; Loo et al. 1991). ESI-MS has proven particularly useful for detection and characterization of folding/unfolding intermediates (Grandori 2002; Haq et al. 2005; Wintrode et al. 2005) and for kinetic studies of protein structural transitions (Konermann 2004; Furdui et al. 2006; Pan et al. 2006). It has been well documented that an unfolded protein molecule can accommodate more protons upon desolvation than can a compact native protein molecule because of the increased solvent-exposed surface area (Mohimen et al. 2003). In particular, various forms of disulphide-intact and disulphide-reduced lysozyme produced by electrospray ionization have been discriminated based on their CSD signatures (Valentine et al. 1997).
There are numerous studies on lysozyme structural changes using ESI-MS in conjunction with other techniques. For instance, ESI-MS combined with stopped-flow fluorescence and circular dichroism was used in characterizing lysozyme refolding intermediates and determining the mechanism of refolding (Matagne et al. 1998; Canet et al. 1999; Kulkarni et al. 1999). Also, the role of the native-like α-domain intermediate in controlling the kinetics of lysozyme refolding was demonstrated by using ESI-MS (Matagne et al. 2000). Further, the structure of the partially unfolded intermediate that is responsible for amyloid fibril formation has also been proposed based on NMR and mass spectrometric studies of two variants of human lysozyme (Dumoulin et al. 2005).
Many proteins have been demonstrated to be able to form amyloid fibrils under certain destabilizing conditions (Uversky and Fink 2004). In the case of hen egg white lysozyme (hereafter referred to as lysozyme), conditions such as heating (Krebs et al. 2000; Arnaudov and De Vries 2005; Xu et al. 2005), chemical denaturation (Vernaglia et al. 2004), chemical denaturation and reduction (Cao et al. 2004; Gu et al. 2004), and addition of alcohols (Hoshino et al. 1997; Goda et al. 2000; Cao et al. 2004) have been reported to facilitate the formation of fibrils. Using DUVRR spectroscopy, we have recently shown that an irreversible unfolding of hen egg white lysozyme at 65°C and low pH is the first step of fibrillation (Xu et al. 2005). The partially unfolded intermediate has been concluded to retain a certain level of α-helix structure (M. Xu, V.V. Ermolenkov, W. He, V.N. Uversky, L. Fredriksen, and I.K. Lednev, in prep.). However, the molecular mechanism of this irreversible transition is yet to be understood.
The lysozyme DUVRR spectroscopic signature comprises amide vibrational modes (Amide I, Amide II, Amide III, and Cα-H bending) and bands arising from aromatic amino acid side chains. The amide vibrational modes provide information on the protein secondary structure, while phenylalanine, a natural DUVRR spectroscopic biomarker (Xu et al. 2005), reports on the tertiary structural evolution. Both secondary and tertiary structures have been found to evolve in a mono-exponential fashion with the same characteristic time of 28 ± 2 h during the irreversible partial unfolding of hen egg white lysozyme (Xu et al. 2005). The latter allowed us to postulate that the irreversible unfolding of lysozyme during fibril formation could be a two-state transition. Testing this hypothesis is the major goal of the study reported here.
Thermal melting of small globular proteins can often be described as a two-state transition, i.e., a two-state, intramolecular, first-order reaction (Privalov 1979; Pfeil 1981; Jackson 1998; Dantas et al. 2003). A microcalorimetric study showed that the temperature-induced melting of lysozyme is a two-state transition (Privalov and Khechinashvili 1974). Thermal unfolding of lysozyme has been well documented to be a reversible process (Velicelebi and Sturtevant 1979). Even the enzymatic activity of lysozyme is partially restored after consecutive heating and cooling (Velicelebi and Sturtevant 1979). Oliva and Muñoz (2004) have recently developed a simple thermodynamic approach for discriminating between two-state and downhill folding (Yang and Gruebele 2004). Their approach is based on the combination of a structural probe of the local backbone conformation and two different denaturing procedures, such as chemical and thermal denaturation. It is not clear at the moment whether the chemical denaturation of lysozyme might result in the formation of the same intermediate product that forms during prolonged incubation at elevated temperature and low pH. Therefore, we utilized a different approach that combines three spectroscopic/spectrometric techniques: DUVRR spectroscopy, intrinsic tryptophan fluorescence spectroscopy, and ESI-MS. In addition, chemometric analysis was used to retrieve quantitative statistical information from Raman, fluorescence spectroscopic, and mass spectrometric data. Although each of the three techniques was found to have certain limitations, their combined application allowed us to conclude definitively that the irreversible partial unfolding of lysozyme proceeds via a two-state transition.
Results and Discussion
6% of α-helical contribution as estimated from the lysozyme DUVRR (Xu et al. 2005) and far-UV CD data (M. Xu, V.V. Ermolenkov, W. He, V.N. Uversky, L. Fredriksen, and I.K. Lednev, in prep.). A gelatinous phase consisting of fibrils started to appear after a lag time of ~2 d (Krebs et al. 2000). The solution and gelatinous parts of the samples were separated by centrifugation and characterized separately (Xu et al. 2005). Fluorescence, DUVRR spectroscopy, and ESI-MS were utilized for characterizing the structural composition of the soluble fractions of lysozyme samples incubated for various times. DUVRR spectra were measured from the soluble fractions as obtained, while the soluble fractions were diluted appropriately for tryptophan fluorescence and ESI-MS measurements (for details, see Materials and Methods). A comprehensive statistical analysis of ESI-MS, fluorescence, and DUVRR spectra was utilized for determining the evolution of lysozyme structural composition during the early stages of fibrillation. In particular, the number of significant (principal) components was established for all three experimental data sets using the chemometric approach. An unambiguous conclusion about the number of principal components in a data set is often difficult, if not impossible, because of random and systematic experimental errors (Malinowski 2002). The determination of the true number of significant factors for data with uncertainty is not a trivial task. No criterion for determining the number of factors is completely satisfactory when used alone (Malinowski 2002). Therefore, both empirical methods and methods requiring knowledge of the experimental error are normally utilized to draw a more reliable conclusion. The estimation of the experimental error in ESI mass spectra was not straightforward because of the relatively large error of the CSD envelopes (see below). This significantly complicated the application of chemometric analysis to the ESI-MS data and therefore required additional experiments for estimating the experimental uncertainty. In contrast to ESI-MS, the experimental error estimated from the average noise amplitude in fluorescence and DUVRR spectra can be used easily as a criterion for the goodness of proposed models.
Empirical chemometric methods such as Malinowski factor indicator function (IND), autocorrelation, cross-validation, etc. (see below), were appropriate for all three types of experimental data and showed consistent results for all three data sets. The combined application of various statistical criteria allowed us to make a solid conclusion about the validity of the two-component model from DUVRR, fluorescence, and ESI-MS data.
Lysozyme unfolding characterized by ESI-MS
The ESI mass spectra for the soluble fraction of lysozyme samples incubated for various times were recorded in the positively charged ion mode using an Agilent 1100 series LC/MSD-SL ion trap system. All the samples were measured under the same optimized instrumental conditions in order to minimize the effect of mass discrimination from the ion trap mass spectrometer. Three characteristic spectra of lysozyme obtained after incubation for 0, 14, and 96 h are shown in Figure 1. The CSD envelope obtained for nonincubated lysozyme in pH 2.0 solution was very narrow (Fig. 1A), which indicated a compact conformation of the protein under acidic conditions, presumably stabilized by four disulfide bonds. The fraction of compact lysozyme in the 96-h incubated sample was estimated to be ~4% according to the 30-h characteristic time of unfolding (Xu et al. 2005). In our experiments, the sensitivity of the ESI-MS detector to compact protein was about half of the sensitivity to the unordered lysozyme. Corrected for this difference in sensitivity, the spectral contribution of compact lysozyme to the ESI mass spectrum of the 96-h incubated sample was ~2%. This allowed us to treat the ESI mass spectrum of 96-h incubated lysozyme as the pure spectrum of the partially unfolded protein.
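The ~4% and ~2% estimates above can be reproduced with a short calculation. The sketch below assumes first-order decay of the compact form with a 30-h characteristic time and assumes that each conformer contributes signal in proportion to its fraction multiplied by its relative sensitivity (0.5 for the compact form); the function names are hypothetical and serve only this illustration.

```python
import math

def compact_fraction(t_h, tau_h=30.0):
    """True fraction of compact protein after t_h hours (first-order decay assumed)."""
    return math.exp(-t_h / tau_h)

def apparent_contribution(frac_compact, rel_sensitivity=0.5):
    """Signal share of the compact form when it is detected with lower sensitivity.

    Assumes signal is proportional to fraction times relative sensitivity.
    """
    weighted = rel_sensitivity * frac_compact
    return weighted / (weighted + (1.0 - frac_compact))

frac_96 = compact_fraction(96.0)          # about 0.04
share_96 = apparent_contribution(frac_96)  # about 0.02
print(f"{frac_96:.3f} {share_96:.3f}")
```

Under these assumptions the compact form accounts for roughly 2% of the 96-h spectrum, which is why that spectrum can be treated as a pure-component spectrum.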
The unfolded conformer, on the other hand, is capable of accepting a larger number of protons (Miranker et al. 1996), forming peaks corresponding to [M+nH]n+ ions with n ranging from eight to 16. [M+13H]13+ ions produced the most abundant peak in the series (Fig. 1B,C). In addition to [M+nH]n+ ions, the ESI mass spectrum of the unfolded protein displayed peaks due to [M+nH-xH2O]n+ (x = 1–2) and [M+nH+yH2O]n+ ions (Fig. 1B, inset). These peaks could come from the formation of water adducts. This process is thermodynamically favorable since the water molecules could efficiently solvate a protein surface carrying excess charge (Hoaglund-Hyzer et al. 1999). However, other processes such as hydrolysis could also account for these peaks (see below). In our study, the most intense peaks of [M+nH+yH2O]n+ ions corresponded to mono- and di-solvates (Fig. 1B, inset). We attempted to suppress the formation of the solvates and reduce the dehydration process by varying the instrumental parameters, such as increasing nebulizing gas pressure, dry gas temperature, dry gas flow rate, or fragmentation voltage or changing the solvent flow. No change in peak intensities within a specific charge state group was found.
Also, a variety of factors may affect the results of ESI-MS in studying protein conformational changes (Rockwood et al. 1991; Mirza and Chait 1997; Hearn et al. 2002; Samalikova et al. 2004). We tested the effect of spray voltage and other ion source settings (nebulizer gas pressure, dry gas pressure and temperature, cap exit voltage, etc.) on the mass spectra of both the compact and the partially unfolded lysozyme. None of the experimental parameters was shown to induce either unfolding of the compact conformation or refolding of the partially unfolded conformation (data not shown).
Processes associated with lysozyme incubation at low pH and elevated temperature could be far more complicated than just protein unfolding. Chemical reactions such as deamination, acid hydrolysis, oxidation, and disulfide bond scrambling could also take place. For instance, the [M+nH-xH2O]n+ ions could be the product of dehydration or deamination (Kim et al. 2002). On the other hand, intact disulfide bridges can hold together the progeny chains formed by the cleavage of peptide bonds. The latter would result in the formation of [M+nH+yH2O]n+ ions, where lysozyme has undergone hydrolysis while the progeny chains are still connected by disulfide bonds. These "nicks" may hinder the ability of lysozyme to refold properly when the thermal stress is removed and thus could also account for the observed irreversibility.
ESI mass spectra of incubated lysozyme showed the presence of a small amount of protein hydrolysis products starting from ~4 h of incubation (data not shown). A fragment with a molecular weight of 8644 Da was the major fragment in the early stage of the incubation. At longer incubation, 96 h in particular, other fragments also appeared (Fig. 1B,C; the ion peaks derived from the fragments are shown as dashed lines with underlined charge labels). The formation of protein fragments of ~10 kDa and <6.5 kDa has been detected by using nonreducing SDS-PAGE after prolonged incubation of hen egg white lysozyme at pH 2 and 57°C (Arnaudov and De Vries 2005). Nevertheless, Arnaudov and De Vries (2005) have concluded that lysozyme hydrolysis is not a prerequisite for fibril formation since no protein fragments were found in the case of prolonged incubation of lysozyme at pH 3, when fibrillar aggregates were also formed as detected by atomic force microscopy (AFM). At the same time, Frare et al. (2004) have recently reported that amyloid fibrils obtained after the incubation of hen egg white lysozyme at pH 2 and 65°C consist predominantly of fragments corresponding to residues 49–100, 49–101, 53–100, and 53–101. The fragments were formed due to the partial acid hydrolysis of Asp-X peptide bonds (Frare et al. 2004). The size of the fragments that we found in the soluble fraction of the incubated lysozyme was in good agreement with the findings of Frare et al. (2004). Hen egg white lysozyme is a globular protein consisting of 129 amino acid residues with a molecular weight of 14,305 Da. It contains four disulfide bonds: 6–127, 64–80, 76–94, and 30–115. The molecular weight of the fragment 49–101 is 5699 Da (Frare et al. 2004). Therefore, the 8644-Da fragment could be assigned to segments 1–48 and 102–129, which are bound by two disulfide bonds, 6–127 and 30–115.
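The mass assignment above can be checked with simple arithmetic. The sketch below assumes that excising residues 49–101 requires hydrolysis of two peptide bonds, each adding one water molecule, and that the 5699-Da value refers to the free 49–101 fragment; the average water mass of 18.015 Da is a standard value that does not come from the text.

```python
WATER = 18.015        # average mass of H2O in Da (assumed, not from the text)
M_INTACT = 14305.0    # intact hen egg white lysozyme, Da (from the text)
M_49_101 = 5699.0     # fragment 49-101, Da (Frare et al. 2004, as cited in the text)

# Cleaving the 48|49 and 101|102 peptide bonds adds two water molecules in total.
# What remains is segments 1-48 and 102-129, still linked by two disulfide bonds.
m_remainder = M_INTACT + 2 * WATER - M_49_101
print(round(m_remainder, 1))
```

The result lies within about 2 Da of the observed 8644-Da species, which is consistent with the proposed assignment under the stated assumptions.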
Pretreatment of ESI mass spectra
Two data sets of ESI mass spectra were acquired for the kinetics of lysozyme irreversible unfolding. The primary data set consisted of 21 ESI mass spectra corresponding to various incubation times (0, 5 min, 15 min, 30 min, 1 h, 2.0–20.0 h at 2-h intervals, 24 h, 30 h, 36 h, 48 h, 72 h, and 96 h). The selected incubation times covered all stages of lysozyme structural transformations from the compact conformation to the partially unfolded conformation according to the 30-h characteristic time of the unfolding (Xu et al. 2005). In order to study the early events of lysozyme unfolding, an additional 23 mass spectra (secondary data set) of lysozyme incubated for various durations (0, 5 min, 10 min, 15 min, 30 min, 45 min, 1 h, 1.5–4 h at 15-min intervals, 4.5 h, 5 h, 6 h, 12 h, and 24 h) were recorded. Both data sets were found to be in good agreement with each other. Since the first ESI-MS experiment covered a longer duration of the incubation, detailed data analysis was performed mainly on this data set.
The raw ESI mass spectra were deconvoluted using DataAnalysis software (version 5.2) supplied with the Agilent 1100 series LC/MSD-SL ion trap system. The ESI mass spectra of unfolded lysozyme revealed charged clusters of several evenly spaced peaks in the region of each expected charge state (e.g., Fig. 1B, inset). The spacing of the peaks corresponded to ~18 atomic mass units. Consequently, individual peaks in each cluster were assigned to lysozyme bearing various numbers of water molecules (Rodriguez-Cruz et al. 1997) and to lysozyme with different degrees of dehydration. Interestingly, the charged clusters were virtually absent in the ESI mass spectra of compact lysozyme but dominated the spectra of unfolded protein (Fig. 1A,C). This is reasonable because an unfolded protein is more likely to bind extra water molecules more tightly than a compact conformer would.
Two alternative approaches for calculating ESI-MS intensities of the charge states (from +8 to +17) were utilized: (1) The height of the most abundant peak in the cluster was used for representing the intensity of a particular charge state, and (2) all peaks in a cluster were integrated to find the intensity of the charge state (see Matlab code in Supporting Material). Obviously, the latter (hereinafter referred to as integrating approach) is more accurate unless charged clusters of lysozyme overlap with ion peaks originating from lysozyme fragments, which was the case in our study. The former approach (maximum peak approach) introduces systematic error by disregarding all minor peaks in each cluster, but at the same time, it is less affected by overlapping with extraneous peaks. All spectra were normalized to the maximum charge state intensities.
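The two ways of assigning an intensity to each charge state can be sketched as follows. This is a minimal Python illustration, not the Matlab code from the Supporting Material; the function names, the window parameters, and the toy stick spectrum are hypothetical, and a centroided (stick) spectrum is assumed.

```python
PROTON = 1.00728   # mass of a proton in Da
WATER = 18.015     # average mass of H2O in Da

def charge_state_intensities(mz, intensity, mass, charges, max_waters=2, tol=0.3):
    """Return {n: (max_peak, integrated)} for the peak cluster of each charge state n.

    The cluster window spans +/- max_waters water masses around [M+nH]n+
    (an assumed width chosen for this illustration).
    """
    out = {}
    for n in charges:
        center = (mass + n * PROTON) / n
        half_width = max_waters * WATER / n + tol
        cluster = [i for m, i in zip(mz, intensity) if abs(m - center) <= half_width]
        out[n] = (max(cluster, default=0.0), sum(cluster))
    return out

def normalize(values):
    """Scale charge-state intensities so that the largest equals 1."""
    top = max(values)
    return [v / top for v in values]

# Toy stick spectrum around the +13 charge state of a 14,305-Da protein.
c13 = (14305.0 + 13 * PROTON) / 13
mz = [c13 - WATER / 13, c13, c13 + WATER / 13, c13 + 10.0]
inten = [20.0, 100.0, 40.0, 500.0]   # the last stick mimics an unrelated fragment peak
peak, area = charge_state_intensities(mz, inten, 14305.0, [13])[13]
print(peak, area)
```

The maximum peak value ignores the two satellite sticks, whereas the integrated value includes them. A fragment peak that fell inside the window would inflate only the integrated value, which is the trade-off described above.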
Chemometric analysis of ESI-MS data
Chemometric analysis of ESI-MS data has been shown to be a powerful tool for distinguishing conformers that evolve during protein structural transition (Mohimen et al. 2003; Navea et al. 2006). We utilized several chemometric approaches to analyze the ESI-MS data for characterizing the evolution of lysozyme structural composition during the early stages of fibrillation.
Abstract factor analysis
The data matrix comprising 21 ESI mass spectra of the incubated lysozyme (primary data set) was constructed for chemometric analysis. In addition, seven ESI mass spectra obtained for both compact and unfolded protein were averaged to obtain high-quality spectra of pure components. The ion peaks corresponding to the whole protein were used to construct the data matrix using the integrating approach described above, while the ion peaks derived from the fragments were ignored.
The result of abstract factor analysis (AFA) on the primary ESI-MS data set is shown in Table 1. The first two eigenvalues (EVs) were appreciably larger than the EVs for other components. The EV for n = 3 is two orders of magnitude smaller than that for n = 2, whereas the EVs for n = 3, 4, 5 ... are of about the same order. This indicated that the principal components starting from n = 3 were error factors (Wise et al. 2005).
0.1% for the first two EVs and >10% for the rest, which was also indicative of two significant components. Autocorrelation coefficients (AUTOs) are normally close to one for smooth principal components and <0.5 for noise factors (Malinowski 2002). In accordance with the proposed two-component model, only the AUTOs for the two largest EVs turned out to be >0.5.
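How eigenvalues and the Malinowski indicator function (IND) point to the number of significant factors can be illustrated on synthetic data. The sketch below is not the analysis performed in this study: the two envelopes, the noise level, and the helper names are invented, and a hand-written Jacobi eigenvalue routine is used only to keep the example free of third-party libraries. The IND definition follows the standard form, RE(n) = sqrt(sum of residual eigenvalues / (r(c - n))) and IND(n) = RE(n)/(c - n)^2, for a data matrix with r rows and c columns.

```python
import math
import random

def jacobi_eigenvalues(sym, sweeps=50):
    """Eigenvalues of a small symmetric matrix by cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in sym]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-18:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):          # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):          # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)

def indicator_function(eigenvalues, n_rows):
    """Malinowski IND(n) for n = 1 .. c-1 from the eigenvalues of D^T D."""
    c = len(eigenvalues)
    ind = {}
    for n in range(1, c):
        residual = max(sum(eigenvalues[n:]), 0.0)
        real_error = math.sqrt(residual / (n_rows * (c - n)))
        ind[n] = real_error / (c - n) ** 2
    return ind

# Synthetic two-state data: 21 incubation times x 9 charge states (+8 to +16).
rng = random.Random(0)
times = [0, 1 / 12, 0.25, 0.5, 1] + list(range(2, 21, 2)) + [24, 30, 36, 48, 72, 96]
charges = range(8, 17)
compact = [math.exp(-((z - 9.5) / 0.7) ** 2) for z in charges]    # narrow envelope (made up)
unfolded = [math.exp(-((z - 13.0) / 1.5) ** 2) for z in charges]  # broad envelope (made up)
data = []
for t in times:
    x = math.exp(-t / 30.0)  # assumed first-order decay of the compact fraction
    data.append([x * a + (1 - x) * b + rng.gauss(0.0, 0.005)
                 for a, b in zip(compact, unfolded)])

m = len(compact)
cov = [[sum(row[i] * row[j] for row in data) for j in range(m)] for i in range(m)]
ev = jacobi_eigenvalues(cov)
ind = indicator_function(ev, len(data))
n_factors = min(ind, key=ind.get)   # IND reaches its minimum at the estimated rank
print(n_factors, [f"{v:.2e}" for v in ev[:4]])
```

For such data the third eigenvalue drops by orders of magnitude relative to the second, and IND is smallest at n = 2. In practice a linear algebra library would replace the Jacobi routine.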
Cross-validation
The cross-validation method was utilized to double-check the two-component model suggested by AFA. Leave-one-out, Venetian blind, random subset, and contiguous block cross-validation methods were performed on both nonscaled and mean-centered data (Wise et al. 2005; Shashilov et al. 2006a). The curve of root mean square error of cross-validation (RMSECV) versus the number of principal components normally has a break or minimum at the true number of significant factors (Malinowski 2002; Wise et al. 2005). Factors that decrease RMSECV by <2% of the maximum RMSECV are usually rejected as error factors (Wise et al. 2005). Figure 2 shows the dependence of RMSECV on the number of principal components for the nonscaled covariance matrix. The RMSECV curve had a break at n = 2, indicating that there were two principal components that contributed to the ESI mass spectra. Other methods of cross-validation on both nonscaled and mean-centered covariance matrices gave similar results (data not shown).
Estimating the relative sensitivity of ESI-MS to compact and partially unfolded protein
Higher sensitivity of ESI-MS toward unfolded protein was observed in our experiment. Several factors, such as the average charge carried by the protein molecules (Pan and McLuckey 2003) and the settings of the instrument, may influence the relative strength of ESI-MS signals from different conformers. It was essential to estimate the sensitivity ratio in order to properly reconstruct the kinetics of lysozyme structural transition.
However, estimating the relative sensitivity of ESI-MS toward compact and unfolded protein was difficult because of some intrinsic limitations of the technique. As reviewed by Kaltashov and Eyles (2005), the relative conformer's ionic signal cannot be equated to its fractional concentration in solution. As a consequence, the spectral contributions of compact and unfolded protein obtained by the fitting of the data set with two basis spectra may not be appropriate in reconstructing the real kinetics of protein structural changes. Therefore, a calibration experiment is necessary to relate the spectral contributions of compact and unfolded proteins to their concentration fractions.
A priori two-component model
The set of samples with a priori two-component composition was prepared by mixing partially unfolded and compact lysozyme in different proportions for estimating the sensitivity of ESI-MS to the two conformers. This set of data was also used for evaluating the error of chemometric analysis on the ESI-MS data.
Given the ESI mass spectra of compact and partially unfolded lysozyme and the fractions of compact and unfolded lysozyme in each sample, all CSD envelopes of the a priori two-component model data set were fitted with the ESI mass spectra of compact and partially unfolded lysozyme using the simplex method (Press 2002). The simplex algorithm searched for the sensitivity ratio that ensured both the best fitting of all spectra and the best agreement between the calculated and actual fractions of compact and unfolded protein used for sample preparation. Such an approach allowed for extracting the relative sensitivity ratio from the spectra constructed using the maximum peak approach (see Pretreatment of ESI mass spectra). The treatment of ESI-MS data obtained by integrating over the charged clusters resulted in more accurate prediction of compact/unfolded proportions in each sample. However, using the experimental ESI mass spectrum of unfolded lysozyme as a basis spectrum for the optimization procedure resulted in degraded fitting quality of ESI mass spectra for samples with larger fractions of compact conformations. This might be because the extraneous peaks of lysozyme fragments overlapping with the ionic peaks of the whole protein were not completely eliminated when integrating over the charged clusters of partially unfolded protein. As a result, the basis spectrum of partially unfolded lysozyme was determined with some uncertainty using the integration approach (Fig. 1F).
The real-coded genetic algorithm (Michalewicz 1996) was utilized to tackle this difficulty (see Matlab code in Supporting Material). During the genetic search, both the sensitivity ratio and the refined mass spectrum of partially unfolded lysozyme were determined. The program attempted to reach the best fit for all spectra in a two-component data set, provided that the fractions of compact and unfolded lysozyme found for each spectrum were close to the actual proportions used for sample preparation. Using the refined ESI mass spectrum of partially unfolded lysozyme extracted by the genetic algorithm, the quality of the fitting was improved, which allowed a more accurate estimation of the sensitivity ratio. The sensitivity ratio was estimated to be 0.52, i.e., ESI-MS was approximately two times more sensitive to the partially unfolded lysozyme than to the compact one in the primary data set for the instrumental settings/parameters used in the present work.
ESI-MS sensitivity estimated from kinetic profile
The kinetic profile of the compact protein resolved by the ALS analysis of the ESI-MS data set was used for estimating the relative sensitivity. The apparent fractions of the compact protein at each incubation time were calculated based on the fitting of the ESI mass spectra with two basis spectra, i.e., spectra of compact and partially unfolded lysozyme. The time dependence of the apparent fraction of the compact protein was then fitted with Equation 1:
|
|
where f is the apparent fraction of compact conformation; t, the time of incubation; τ, the characteristic time (~30 h); and the remaining fitting parameter, the sensitivity of ESI-MS to the compact conformation. Assuming that the characteristic time (τ) equals 30 h, a sensitivity of 0.45 was obtained. This calculated relative sensitivity of ESI-MS for compact protein was consistent with the result obtained based on the analysis of the a priori two-component model (see above). Figure 3 shows the apparent fractions of the compact lysozyme calculated by chemometrics and the fitting curve based on Equation 1 for the primary data set.
The algorithm searched for the refined spectrum of partially unfolded lysozyme, the sensitivity ratio, and the characteristic time (τ) of lysozyme unfolding. The results of the fitting for the primary data set are shown in Figure 4. The characteristic time of lysozyme structural rearrangement was calculated to be 29.6 h; the sensitivity ratio, 0.46. The spectrum of partially unfolded lysozyme retrieved from the ESI-MS data set of the incubated lysozyme was close to that obtained by the fitting of the a priori two-component data set.
Estimating the sensitivity threshold
Since the distribution of protein molecules bearing a certain charge has a statistical nature, the CSD envelopes obtained for the same sample from different acquisitions may not be identical. Consequently, any difference between two ESI mass spectra that does not exceed the estimated intrinsic error may not be regarded as relevant. This provides a method for estimating the sensitivity threshold for a particular ESI-MS experiment.
Seven ESI mass spectra recorded for the same sample of partially unfolded lysozyme were averaged. The average spectrum was subtracted from each spectrum of partially unfolded lysozyme to produce difference spectra. The fraction of experimental error was calculated as follows:
|
|
where $D_{ij}$ and $S_{ij}$ are the experimental and calculated intensities of the ith charge state in the jth CSD envelope, respectively. The fraction of experimental error defined by Equation 2 does not depend on the normalization of CSD envelopes.
The calculated fraction f was regarded as the threshold value. In other words, if the contribution of a species to the ESI mass spectrum becomes lower than the threshold value, then the species can no longer be confidently detected by the technique. The horizontal dashed line in Figure 3 shows the threshold sensitivity, which was estimated to be ~0.077 for the primary data set. As evident from the figure, the fraction of the compact sample became lower than the threshold value for the samples incubated for >60 h.
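The threshold idea can be illustrated with a small sketch. Two assumptions are made here that do not come from the text: the error fraction is taken as the root of the summed squared deviations from the mean envelope divided by the summed squared intensities, which is one normalization-independent choice among several, and the time at which the compact form drops below the threshold is derived from a signal-weighted first-order decay model. The replicate envelopes are invented.

```python
import math

def error_fraction(replicates):
    """Relative RMS deviation of replicate envelopes from their mean (assumed form)."""
    n = len(replicates)
    mean = [sum(col) / n for col in zip(*replicates)]
    num = sum((d - s) ** 2 for row in replicates for d, s in zip(row, mean))
    den = sum(d ** 2 for row in replicates for d in row)
    return math.sqrt(num / den)

def time_below_threshold(threshold, tau=30.0, sens=0.45):
    """Incubation time (h) at which the apparent compact fraction equals `threshold`.

    Inverts f = s*c / (s*c + 1 - c) with c = exp(-t/tau); this model is an assumption.
    """
    compact = threshold / (sens * (1.0 - threshold) + threshold)
    return -tau * math.log(compact)

# Made-up replicate envelopes of one sample (four charge states each).
reps = [[0.10, 0.55, 1.00, 0.60],
        [0.12, 0.50, 1.00, 0.66],
        [0.09, 0.58, 1.00, 0.57]]
scaled = [[1000.0 * v for v in row] for row in reps]   # same data on another intensity scale
f_rep = error_fraction(reps)
t_cross = time_below_threshold(0.077)
print(round(f_rep, 4), round(t_cross, 1))
```

Rescaling all envelopes by a common factor leaves this error fraction unchanged. With the threshold of 0.077, τ = 30 h, and a sensitivity of 0.45, the crossing time in this sketch falls between the 48-h and 72-h samples.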
As seen from Figures 3 and 4, the data points corresponding to 48 and 72 h of incubation have the largest deviation from the modeling curves. This can be explained by both the intrinsic uncertainty of CSD envelopes and the larger spectral contribution from lysozyme fragments that have not been accounted for or assigned to either the compact or the partially unfolded conformation. Both the 48- and 72-h data points were calculated based on a single experimental spectrum. In contrast, the last (96-h) data point in Figures 3 and 4 was from the averaged spectrum based on seven independent acquisitions. This resulted in a statistically more reliable ESI mass spectrum and a more accurate fit of the 96-h data point to the modeled kinetics curve.
Estimating experimental error for the ESI-MS data set
One of the major challenges associated with the fitting of mass spectra is the relatively large experimental error of the method. Discrepancies in relative abundances between spectra obtained from different acquisitions of the very same sample can be observed despite the fact that experimental conditions were kept the same, especially for the unfolded samples (data not shown). The standard deviation in the set of ESI mass spectra obtained for the same sample can be used for estimating the standard error of the experiment. The root-mean-square errors (RMSEs) were calculated according to Equation 3:
where S_ij^exp and S_ij^pred are the experimental and calculated intensities of the ith charge state in the jth CSD envelope, respectively; n is the number of CSD envelopes; and m is the number of charge states in each envelope. The RMSE defined by Equation 3 does not depend on the normalization of CSD envelopes. The RMSEs were found to be 2.5 × 10⁴ for the set of spectra from compact lysozyme, 1.2 × 10⁴ for the set of spectra from partially unfolded lysozyme, and 5.8 × 10³ for the fitting of the data matrix of mass spectra for incubated lysozyme. The calculated standard error of fitting was found to be of the same order as the experimental error.
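As an illustration of the quantity defined by these symbols, the following minimal sketch computes a conventional RMSE over an n × m matrix of charge state intensities. The exact form of Equation 3 is not reproduced here, so the plain root-mean-square form, the function name `rmse`, and the numerical values are all assumptions made for illustration only.

```python
import numpy as np

def rmse(s_exp, s_pred):
    """Conventional RMSE over an (n envelopes x m charge states) matrix.

    Assumed form: sqrt( sum_ij (S_exp - S_pred)^2 / (n * m) ).
    """
    s_exp = np.asarray(s_exp, dtype=float)
    s_pred = np.asarray(s_pred, dtype=float)
    if s_exp.shape != s_pred.shape:
        raise ValueError("experimental and predicted matrices must match in shape")
    return float(np.sqrt(np.mean((s_exp - s_pred) ** 2)))

# Hypothetical intensities: 2 CSD envelopes x 3 charge states.
s_exp = np.array([[1.0e5, 3.0e5, 2.0e5],
                  [1.2e5, 2.8e5, 2.1e5]])
# Every predicted intensity is off by exactly 1e4 counts.
s_pred = s_exp + np.array([[1.0e4, -1.0e4, 1.0e4],
                           [-1.0e4, 1.0e4, -1.0e4]])
value = rmse(s_exp, s_pred)
print(value)
```

Because every deviation in the example has the same magnitude, the RMSE equals that magnitude, which makes the sketch easy to check by hand.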
The RMSE for the a priori two-component model (2.4 × 10⁵) was similar to that of the incubated samples (2.9 × 10⁵). If the residual error of the two-component fitting for the a priori two-component set can be accepted as experimental error (it should be, since the spectra do represent a mixture of two species), the comparable error for the fitting of the data set from incubated samples should also be considered as experimental error. This conclusion additionally supported the validity of the two-state transition model. The RMSE of the fitting of the data set for the incubated samples is slightly larger than that for the a priori two-component data set (2.9 × 10⁵ vs. 2.5 × 10⁵). Additional error in the data set of the incubated samples might arise from lysozyme fragmentation (see above). In fact, lysozyme fragments that were not accounted for by the two-component model were present in both incubated and mixed samples. However, the fraction of fragments in the mixed samples was proportional to the fraction of partially unfolded protein, since all samples were prepared using the same solution of partially unfolded protein. This means that the ESI mass spectra of partially unfolded lysozyme and its fragments were treated as a single principal component because their concentrations were 100% correlated. This might not be the case for the incubated lysozyme, since fragmentation and unfolding can occur at different rates, which introduces a small additional error into the data set for the incubated lysozyme.
Analysis of the ESI-MS data on the early stage of lysozyme structural changes
All the steps of the quantitative analysis described above were repeated for the ESI-MS data set reflecting the early stage of lysozyme unfolding (secondary data set). The data sets of 23 ESI mass spectra of the incubated lysozyme (within 24 h) and 23 ESI mass spectra of a priori mixtures of compact/unfolded lysozyme were analyzed using the chemometric approach. The results confirmed the proposed two-component model in the very early stage of the unfolding and suggested mono-exponential kinetics for lysozyme structural changes with a characteristic time of ~30 h.
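To make the notion of a characteristic time concrete, the sketch below recovers it from a decaying fraction of the compact conformer. It assumes the mono-exponential form f(t) = A·exp(−t/τ) and uses a simple log-linear least-squares fit; this is not necessarily the form of Equation 1 or the fitting procedure used in the study, and the time points and fractions are synthetic.

```python
import numpy as np

def fit_characteristic_time(t_hours, fraction):
    """Fit f(t) = A * exp(-t / tau) by linear regression on ln f.

    Returns (tau, A). Assumes all fractions are strictly positive.
    """
    t = np.asarray(t_hours, dtype=float)
    f = np.asarray(fraction, dtype=float)
    if np.any(f <= 0):
        raise ValueError("fractions must be positive for a log-linear fit")
    slope, intercept = np.polyfit(t, np.log(f), 1)
    return float(-1.0 / slope), float(np.exp(intercept))

# Synthetic, noise-free data generated with tau = 30 h (hypothetical).
t = np.array([0.0, 2.0, 4.0, 8.0, 16.0, 24.0])
f = np.exp(-t / 30.0)
tau, amplitude = fit_characteristic_time(t, f)
print(tau, amplitude)
```

A log-linear fit weights small fractions more heavily than a direct nonlinear fit would, so for noisy data near the detection threshold the two approaches can give different estimates.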
Factor analysis of charged clusters in ESI mass spectra of lysozyme
As is evident from the ESI mass spectra (Fig. 1B,C), charged clusters are absent in the spectra of compact lysozyme but grow with incubation time. This observation suggests a correlation between the intensities of peaks in the charged clusters and the fraction of the unfolded protein in the sample. We performed AFA on the charged clusters corresponding to the 8+, 9+, and 10+ charge states in both the a priori two-component and the incubated lysozyme data sets. The results revealed the presence of two significant components in both data sets (data not shown). However, determining the fractions of compact and unfolded protein in each sample by fitting the charged clusters was difficult, since the shapes of the clusters differ even for spectra recorded from the very same sample. We plan to develop a model that would best describe the cluster shapes and to fit the clusters using a Bayesian statistical approach.
The results of ESI-MS experiments proved the two-state mechanism of lysozyme unfolding. Nevertheless, the quantitative characterization of the reaction kinetics based on ESI-MS data might not be accurate for several reasons. First, the fractions of compact and partially unfolded protein in the samples were determined by an indirect method using a sensitivity ratio that was itself estimated with some uncertainty. Second, the characteristic time of lysozyme structural changes was evaluated by fitting the two-parameter function (Equation 1). Fitting of such functions is referred to as an ill-posed problem (Johnson 2000); i.e., small errors in experimental data can significantly affect the estimated parameters, especially when they are correlated. To verify the obtained kinetic law of thermally induced lysozyme unfolding, we utilized intrinsic tryptophan fluorescence and deep UV Raman spectroscopy.
Intrinsic tryptophan fluorescence spectra of incubated lysozyme: Chemometric analysis
Ten fluorescence spectra for nonincubated lysozyme and the soluble fraction of lysozyme samples incubated at 65°C for 5 min, 1 h, 4 h, 8 h, 16 h, 24 h, 2 d, 4 d, and 8 d were measured and used for chemometric analysis. Figure 5A shows selected fluorescence spectra of lysozyme (soluble fractions) incubated for various times. The spectra are dominated by the tryptophan fluorescence. Hen egg white lysozyme has six tryptophan residues: Trp 28, 62, 63, 108, 111, and 123. A bathochromic shift of the tryptophan fluorescence peak caused by the incubation can be assigned to the exposure of tryptophan side-chain(s) to a more hydrophilic environment due to the unfolding of lysozyme (Demchenko 1986; Chi and Asher 1998).
Figure 6 shows the dependence of RMSECV on the number of principal components for the mean-centered covariance matrix (Wise et al. 2005). The RMSECV curve shows a break at two significant factors. Leave-one-out, Venetian blind, and random subset cross-validation algorithms all showed a very similar dependence of RMSECV on the number of factors (data not shown). The application of the cross-validation methods to the fluorescence spectroscopic data set indicated the presence of only two significant components, which is consistent with the result of AFA.
The fluorescence spectra in Figure 5A were normalized in order to show the bathochromic shift as lysozyme unfolded during the incubation. The normalization does not affect the result of factor analysis since the relative magnitude of the EVs remains the same as long as the fluorescence spectra of the two principal components are of comparable intensity (within the same order of magnitude).
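The effect of normalization on the number of significant factors can be illustrated with a small synthetic example. The sketch below builds ten noisy mixtures of two hypothetical Gaussian emission bands, counts the singular values above an assumed noise floor, and repeats the count after peak normalization. The band positions, widths, noise level, and the `noise_floor` threshold are illustrative assumptions, not values from this study, and singular value counting is only a simplified stand-in for the AFA and cross-validation procedures described above.

```python
import numpy as np

# Hypothetical emission wavelength grid (nm) and two pure-component bands.
wavelength = np.linspace(300.0, 400.0, 101)
compact = np.exp(-((wavelength - 335.0) / 20.0) ** 2)
unfolded = 0.8 * np.exp(-((wavelength - 350.0) / 22.0) ** 2)

# Ten samples in which the compact fraction falls from 1 to 0.
fractions = np.linspace(1.0, 0.0, 10)
data = np.outer(fractions, compact) + np.outer(1.0 - fractions, unfolded)
rng = np.random.default_rng(0)
data = data + rng.normal(scale=1e-3, size=data.shape)  # assumed noise level

def significant_factors(matrix, noise_floor=0.1):
    """Count singular values above an assumed noise floor."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return int(np.sum(singular_values > noise_floor))

# Peak-normalize every spectrum, as is done for display purposes.
normalized = data / data.max(axis=1, keepdims=True)

n_raw = significant_factors(data)
n_normalized = significant_factors(normalized)
print(n_raw, n_normalized)
```

Each normalized spectrum is still a linear combination of the same two bands, only with rescaled coefficients, so the rank of the noise-free data matrix is unchanged.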
DUVRR spectra of lysozyme: Changes with the incubation time and chemometric analysis
Ten 197-nm excited DUVRR spectra of the soluble fraction of lysozyme samples incubated for various times (nonincubated, 15 min, 45 min, 2 h, 6 h, 12 h, 24 h, 2 d, 4 d, and 8 d) were used for chemometric analysis. The Raman spectra (data not shown) of the gelatinous part of the samples separated by centrifugation were dominated by the DUVRR signature of β-sheet (Xu et al. 2005). Figure 8A shows selected DUVRR spectra of the soluble fraction of lysozyme samples. The major spectroscopic changes were evident for the Cα-H bending mode and the 1000-cm⁻¹ phenylalanine band (ν12 ring breathing vibrational mode [Matsuda et al. 2003], in which alternating carbon atoms move toward and away from the center of the benzene ring) (Ziegler and Albrecht 1979; Asher et al. 1986; Fodor et al. 1989). The increase in the intensity of the Cα-H band indicated the melting of α-helix and the formation of an unordered structure. A dramatic decrease in the intensity of the 1000-cm⁻¹ phenylalanine band indicated the exposure of phenylalanine side chains to water in the partially unfolded lysozyme (Xu et al. 2005).
EFA suggested that the contribution of partially unfolded lysozyme appeared in DUVRR spectra as early as 2 h after the start of incubation, while the spectral contribution from the compact lysozyme disappeared after 4 d of incubation (Fig. 9). Only one significant component remained after 8 d of incubation. It was therefore reasonable to assign this component to the spectral contribution of the partially unfolded lysozyme.
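The idea behind EFA, as it is commonly described, is to follow the number of significant factors while spectra are added one at a time in the forward (time-ordered) and backward (reverse-ordered) directions. The sketch below is a minimal illustration on synthetic data; the Gaussian bands, the concentration profile, the noise level, and the `noise_floor` threshold are all hypothetical and do not correspond to the DUVRR data set.

```python
import numpy as np

def count_factors(block, noise_floor):
    """Number of singular values of a sub-matrix above the noise floor."""
    singular_values = np.linalg.svd(block, compute_uv=False)
    return int(np.sum(singular_values > noise_floor))

def evolving_factor_analysis(data, noise_floor):
    """Forward and backward factor counts for time-ordered spectra (rows)."""
    n = data.shape[0]
    forward = [count_factors(data[:k], noise_floor) for k in range(1, n + 1)]
    backward = [count_factors(data[n - k:], noise_floor) for k in range(1, n + 1)]
    return forward, backward

# Synthetic two-component data (hypothetical bands and fractions).
axis = np.linspace(300.0, 400.0, 101)
compact = np.exp(-((axis - 335.0) / 20.0) ** 2)
unfolded = 0.8 * np.exp(-((axis - 350.0) / 22.0) ** 2)
compact_fraction = np.array([1.0, 1.0, 1.0, 0.9, 0.7, 0.5, 0.3, 0.1, 0.0, 0.0])
data = (np.outer(compact_fraction, compact)
        + np.outer(1.0 - compact_fraction, unfolded))
rng = np.random.default_rng(1)
data = data + rng.normal(scale=1e-3, size=data.shape)  # assumed noise level

forward, backward = evolving_factor_analysis(data, noise_floor=0.05)
print(forward)
print(backward)
```

In the forward direction the second factor appears at the first spectrum that contains the unfolded species; in the backward direction it appears at the last spectrum that still contains the compact species.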
The evolution of the contribution from spectra of compact and partially unfolded lysozyme to both fluorescence and DUVRR spectra allowed for establishing the kinetics of protein unfolding based on these two data sets. The fraction of the compact conformer decreased mono-exponentially with the incubation time with the characteristic time of 30.7 ± 3.3 h for fluorescence spectroscopic data and 32.1 ± 2.9 h for DUVRR spectroscopic data, respectively. These characteristic times are in agreement with our previous estimation (Xu et al. 2005).
Strictly speaking, monitoring "pure" secondary structure changes and extracting the characteristic time of secondary structure evolution would require the elimination of aromatic bands from DUVRR spectra (see Fig. 8) since they normally report on the tertiary structure changes. The ability to fit both aromatic and amide regions of DUVRR spectra with two pure component spectra further confirmed a 100% correlation in secondary and tertiary structural changes.
It is important to note here that a small amount of newly formed β-sheet has been observed in the soluble fraction of the incubated samples of lysozyme (Xu et al. 2005). Chemometric analysis of the Raman spectral data reported here did not identify a β-sheet-rich spectrum as a separate principal component. Instead, the β-sheet contribution was incorporated into the spectrum of the partially unfolded lysozyme. This means that the formation of β-sheet and that of unordered structures might be correlated. The development of statistical methods of analysis that would allow species with partially correlated behavior to be distinguished is currently underway in our laboratory (Shashilov et al. 2006b).
Two-state transition
Two-state transition means that any protein conformations other than the initial and final ones exhibit a very short lifetime and do not contribute significantly to the composition of the system at any time. The proposed two-state reaction mechanism would be rejected if at least one set of spectra (DUVRR, fluorescence, or mass spectra) obtained at various incubation times could not be represented as a linear combination of two basis spectra corresponding to the initial and final species. In terms of chemometric analysis, evidence of a third principal component would be inconsistent with the two-state mechanism. However, a good two-component fit of the fluorescence or Raman spectra is not sufficient to ultimately prove the two-state mechanism. In fact, a mixture of compact and partially unfolded protein molecules may result in the same Raman and fluorescence spectra as an ensemble of protein molecules that are unfolded to different degrees between the two terminal conformational states (the compact and partially unfolded conformations). Protein/polypeptide DUVRR spectra can be approximated as a linear combination of the "pure secondary structure spectra" (Chi et al. 1998). For example, DUVRR spectra of a 21-amino-acid residue, mainly alanine peptide (AP) measured at various temperatures have been fit well with a linear combination of an α-helix spectrum and a temperature-dependent random coil spectrum (Lednev et al. 1999). However, the DUVRR spectroscopic study of α-helix melting of the isotope-labeled analog of AP revealed a gradual melting mechanism (Ianoul et al. 2002).
Unlike Raman and fluorescence spectroscopy, ESI-MS can definitively distinguish intermediate states with a different degree of unfolding from a mixture of the compact and the partially unfolded conformations. This is because every intermediate state should result in a specific CSD envelope, which cannot be fitted with a sum of the CSD envelopes obtained for the two terminal conformations, i.e., the compact and the partially unfolded conformations (Dobo and Kaltashov 2001; Kaltashov and Eyles 2005). In other words, the CSD envelope of a partially folded protein cannot be fitted with a linear combination of the CSD envelopes of completely folded and unfolded proteins. Therefore, if chemometric analysis of mass spectra reveals only two principal components, this should be sufficient evidence for the two-state transition mechanism, although the difference in sensitivity for different conformations should be taken into account if the kinetics parameters are derived from ESI-MS data.
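The linear-combination argument can be illustrated numerically. In the sketch below, a true mixture of two terminal CSD envelopes is reproduced exactly by a two-component least-squares fit, whereas an envelope centered at an intermediate charge state leaves a large residual. The Gaussian envelope shapes, the charge state range, and the envelope centers are hypothetical choices made only for illustration.

```python
import numpy as np

charges = np.arange(6, 15)  # hypothetical charge states 6+ to 14+

def envelope(center, width=1.0):
    """Hypothetical Gaussian CSD envelope normalized to unit total intensity."""
    profile = np.exp(-((charges - center) / width) ** 2)
    return profile / profile.sum()

compact = envelope(8.0)     # assumed low-charge envelope (compact form)
unfolded = envelope(12.0)   # assumed high-charge envelope (unfolded form)
basis = np.column_stack([compact, unfolded])

def two_state_fit(spectrum):
    """Least-squares fit with two basis envelopes; returns (coeffs, relative residual)."""
    coeffs, *_ = np.linalg.lstsq(basis, spectrum, rcond=None)
    residual = spectrum - basis @ coeffs
    return coeffs, float(np.linalg.norm(residual) / np.linalg.norm(spectrum))

mixture = 0.3 * compact + 0.7 * unfolded   # genuine two-state mixture
intermediate = envelope(10.0)              # hypothetical intermediate conformer

coeffs_mix, err_mix = two_state_fit(mixture)
coeffs_int, err_int = two_state_fit(intermediate)
print(coeffs_mix, err_mix)
print(coeffs_int, err_int)
```

An unconstrained least-squares fit is used here for brevity; a non-negativity constraint on the coefficients would be a natural refinement for real intensity data.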
Compared with the timescale of the overall fibrillation process, the partially unfolded conformation forms rather quickly once the molecule overcomes a barrier associated with a certain bottleneck transformation (transition state). The contribution of the lysozyme molecules in the transition conformation(s), if any, is negligible. Therefore, the possibility of characterizing the transition state directly using the spectroscopic techniques utilized here is limited. Other methods, such as the Φ-value approach (Fersht and Sato 2004; Raleigh and Plaxco 2005), could be used to probe the transition state, while multidimensional NMR could be used for the structural characterization of the partially unfolded lysozyme.
Conclusions
It has been postulated that a misfolded/partially unfolded intermediate is a prerequisite for the formation of fibrils (Bellotti et al. 2000; Goers et al. 2002; Cao et al. 2004; Uversky and Fink 2004; Surewicz et al. 2006). Irreversible partial unfolding of hen egg white lysozyme has been recently reported (Xu et al. 2005) to occur at the early stages of fibrillation at elevated temperature and low pH. This irreversible process differs significantly from a reversible thermal unfolding of lysozyme, resulting in relatively small changes at 65°C and pH 2. Earlier experiments (Xu et al. 2005) allowed us to hypothesize that this irreversible unfolding is a two-state transition. This hypothesis was tested here by a combination of three spectroscopic/spectrometric techniques (intrinsic tryptophan fluorescence, DUVRR spectroscopy, and ESI-MS) along with comprehensive chemometric analysis of the experimental data. Principal component analysis (PCA) of all data sets unanimously suggested the existence of only two principal components during the irreversible unfolding. ALS allowed the spectra of compact and partially unfolded lysozyme to be determined for each of the three spectroscopic/spectrometric methods. The relative contributions of compact and partially unfolded protein in each sample were also evaluated by ALS and genetic fitting. The evolution profiles for both conformers were determined by plotting the contributions versus the incubation times. The characteristic time of the irreversible unfolding was estimated to be ~30 h, which is in good agreement with our previous report (Xu et al. 2005).
Two significant components dominating fluorescence and DUVRR spectra cannot be considered sufficient evidence for the proposed two-state transition, while two factors in the ESI-MS data set can give definitive proof of the presence of only two conformers. However, because of the indirect relation between the charge state intensities and the concentration fractions of protein conformers, caution should be taken when interpreting the kinetics parameters extracted from ESI-MS data. Other spectroscopic techniques, such as DUVRR and fluorescence spectroscopy, were shown to be able to extract kinetics parameters more accurately.
The unfolding process discussed herein is a relatively slow process (characteristic time of ~30 h). All the spectroscopic measurements were carried out at room temperature. Therefore, the two-state model does not rule out the existence of short-lived structural intermediates formed between the compact and final partially unfolded conformations. The approach utilized here, i.e., the combination of several spectroscopic/spectrometric techniques with chemometric analysis, could also be used in protein folding studies to determine the number of folding intermediates.
| Materials and Methods |
|---|
A 14 mg/mL lysozyme solution at pH 2.0 was prepared and incubated at 65°C for various durations (Krebs et al. 2000; Xu et al. 2005). The incubated solutions were centrifuged at 16,000g for 30 min to separate the soluble fraction from the gelatinous fraction that is dominated by fibrils. The soluble fraction was used directly for Raman spectroscopic measurements and, with proper dilution, for the tryptophan fluorescence spectroscopic (0.01 mg/mL) and ESI-MS (1 mg/mL) measurements. All spectroscopic/spectrometric measurements were conducted at room temperature. Protein concentrations were determined by diluting the samples into 6 M guanidine solution and measuring the UV absorbance at 280 nm. A280nm(1 mg/mL) = 2.37 was used for calculating the concentration of lysozyme (Cao et al. 2004).
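The concentration calculation from the absorbance at 280 nm can be sketched as follows. Only the value A280nm(1 mg/mL) = 2.37 comes from the text; the 1-cm path length, the function name, and the example absorbance and dilution factor are assumptions for illustration.

```python
A280_PER_MG_ML = 2.37  # absorbance of a 1 mg/mL lysozyme solution (from the text)

def lysozyme_concentration(a280, dilution_factor=1.0, path_cm=1.0):
    """Concentration (mg/mL) of the undiluted sample.

    Assumes a linear absorbance-concentration relation and, by default,
    a 1-cm optical path length (an assumption, not stated in the text).
    """
    if a280 < 0 or dilution_factor <= 0 or path_cm <= 0:
        raise ValueError("inputs must be physically meaningful")
    return a280 / (A280_PER_MG_ML * path_cm) * dilution_factor

# Hypothetical reading: A280 = 0.474 after a 20-fold dilution into guanidine.
conc = lysozyme_concentration(0.474, dilution_factor=20.0)
print(round(conc, 3))
```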
Fluorescence and absorption spectroscopy
Tryptophan fluorescence spectra were measured using a PerkinElmer LS 55 Luminescence Spectrometer. The excitation wavelength was 280 nm. Slits for both excitation and emission were 5.0 nm. UV absorption spectra were measured on a Hewlett-Packard 8452A diode array spectrophotometer. The standard error of the fluorescence spectra was calculated based on the average noise amplitude on the background level.
DUVRR spectroscopy
The home-built DUVRR spectroscopic apparatus has been described in detail elsewhere (Lednev et al. 2005). The 197-nm fourth-harmonic output of an Indigo S laser system (Coherent, Inc.) was used for excitation. Raman scattering was dispersed and recorded using a home-built double monochromator coupled with a liquid-nitrogen-cooled CCD camera (Roper Scientific, Inc.). A rotating Suprasil NMR tube with a magnetic stirrer inside was used as a sample holder. The acquired Raman spectra were analyzed using GRAMS/AI software (Thermo Electron Corp.). The spectral contributions of water and quartz were numerically subtracted. Sodium trifluoroacetate was used as the internal standard for intensity normalization.
Electrospray ionization mass spectrometry
ESI mass spectra were obtained on an Agilent 1100 series LC/MSD-SL ion trap system. Samples were introduced into the ion source using the autosampler (solvent flow rate, 200 µL/min). The mobile phase consisted of 0.2% formic acid water solution. The data were collected in positively charged ion mode in a mass range from m/z 800 to m/z 2100. The i