Institute for Biological Instrumentation, Russian Academy of Sciences, 142292 Pushchino, Moscow Region, Russia; Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064, USA
Reprint requests to: Vladimir N. Uversky, Department of Chemistry and Biochemistry, University of California, Santa Cruz, CA 95064, USA; e-mail: uversky{at}hydrogen.ucsc.edu; fax: (831) 459-2935.
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.4210102.
Abstract
The experimental material accumulated in the literature on the conformational behavior of intrinsically unstructured (natively unfolded) proteins was analyzed. Results of this analysis showed that these proteins do not possess uniform structural properties, as expected for members of a single thermodynamic entity. Rather, these proteins may be divided into two structurally different groups: intrinsic coils and premolten globules. Proteins from the first group have hydrodynamic dimensions typical of random coils in poor solvent and do not possess any (or almost any) ordered secondary structure. Proteins from the second group are considerably more compact and exhibit some residual secondary structure, although they are still less dense than native or molten globule proteins. An important feature of the intrinsically unstructured proteins is that they undergo a disorder-order transition during or prior to their biological function. In this respect, the Protein Quartet model, with function arising from four specific conformations (ordered forms, molten globules, premolten globules, and random coils) and transitions between any two of the states, is discussed.
Keywords: Intrinsically unfolded protein; intrinsically disordered protein; unfolded protein; molten globule; premolten globule; partially folded intermediate; random coil; conformational transition
Abbreviations: NMR, nuclear magnetic resonance; CD, circular dichroism; UV, ultraviolet; ORD, optical rotatory dispersion; FTIR, Fourier-transform infrared; SAXS, small-angle X-ray scattering; SANS, small-angle neutron scattering; FRET, fluorescence resonance energy transfer; N, native; MG, molten globule; PMG, premolten globule; U, unfolded; NU, natively unfolded; NU(coil), natively unfolded proteins with coil-like properties; NU(PMG), natively unfolded proteins with PMG-like properties
This review introduces an intriguing protein family of natively unfolded proteins, whose existence questions one of the cornerstones in protein biology, chemistry and physics, that is, the structure-function paradigm. This concept claims that a specific function of a protein is determined by its unique and rigid three-dimensional (3D) structure. This idea, formulated more than 100 years ago as a lock-and-key model for explaining the amazing specificity of the enzymatic hydrolysis of glucosides (Fischer 1894), proved to be extremely fruitful. Figuratively speaking, the protein structure-function paradigm may be considered as the big bang, creating the universe of modern protein science. Figure 1 attempts to illustrate the most obvious scientific consequences of this concept.
Proteomics versus protein structure-function paradigm
The application of neural network predictors of protein disorder, which use primary sequence information, to the Swiss Protein Database has predicted that more than 15,000 proteins may contain disordered regions of at least 40 consecutive amino acid residues, with more than 1050 of them having high scores indicating disorder (Dunker et al. 1998; Romero et al. 1998b). This observation led to the conclusion that "a large portion of gene sequences appear to code not for folded, globular proteins, but for long stretches of amino acids that are likely to be either unfolded in solution or adopt non-globular structures of unknown conformation. . . . The high proportion of gene sequences in the genomes of all organisms argues for important, as yet unknown functions, since there could be no other reason for their persistence throughout evolution" (Wright and Dyson 1999). Intriguingly, recent predictions on 29 genomes have established that proteins from eucaryotes have more intrinsic disorder than those from bacteria and archaea, with more than 30% of eucaryotic proteins having disordered regions greater than 50 consecutive residues (Dunker et al. 2001).
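The length criterion used in these predictions (a run of at least 40 consecutive residues flagged as disordered) can be illustrated with a minimal sketch. It assumes that a per-residue disorder score between 0 and 1 is already available from some predictor. The function name, the 0.5 cutoff, and the example scores are hypothetical and do not reproduce the predictors of Dunker and coauthors.

```python
from typing import List, Sequence, Tuple


def long_disordered_regions(
    scores: Sequence[float],
    threshold: float = 0.5,   # hypothetical cutoff on the per-residue score
    min_length: int = 40,     # length criterion quoted in the text
) -> List[Tuple[int, int]]:
    """Return (start, end) pairs (0-based, end exclusive) for every run of
    consecutive residues whose score is >= threshold and whose length is
    at least min_length."""
    regions: List[Tuple[int, int]] = []
    run_start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if run_start is None:
                run_start = i          # a new run begins here
        else:
            if run_start is not None and i - run_start >= min_length:
                regions.append((run_start, i))
            run_start = None           # the run (if any) is over
    # Close a run that extends to the end of the sequence.
    if run_start is not None and len(scores) - run_start >= min_length:
        regions.append((run_start, len(scores)))
    return regions


# Made-up scores: a 45-residue disordered run, an ordered linker,
# and a 20-residue run that is too short to count.
example_scores = [0.9] * 45 + [0.1] * 10 + [0.8] * 20
example_regions = long_disordered_regions(example_scores)
```

With these made-up scores only the first run satisfies the 40-residue criterion.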
Protein self-organization and protein structure-function paradigm
A rapidly growing set of proteins has been shown to be disordered, or to contain extensive disordered regions, under physiological conditions. Several experimental approaches sensitive to the intrinsic disorder of a given protein or its part have been used to provide this evidence, drawing on studies of protein self-organization.
It is known that the unique 3D structure of a globular protein is stabilized by noncovalent interactions (conformational forces) of different natures, such as hydrogen bonds, hydrophobic forces, van der Waals interactions, etc. It was established long ago that high concentrations of strong denaturants (such as urea or guanidinium chloride [GdmCl]) lead to the complete disruption of all these interactions and, as a consequence, to the transformation of an initially folded protein molecule into a highly disordered random coil (Anson and Mirsky 1932; Mirsky and Pauling 1936; Neurath et al. 1944; Tanford 1968). In other words, such conditions cause the complete unfolding of proteins. However, sometimes changes in the environment can reduce (or even completely shut down) part of the conformational interactions, while the rest remain unchanged (or are even intensified). In these cases proteins will usually lose their biological activity, that is, they will be denatured. Denaturation is not necessarily accompanied by the complete unfolding of a protein, but rather results in the appearance of new conformations with properties intermediate between those of the native and the completely unfolded states.
It is known that globular proteins may exist in at least four different conformations: native (ordered), molten globule, premolten globule, and unfolded (Uversky and Ptitsyn 1994, 1996a; Ptitsyn 1995; Uversky 1997, 1998). The structural properties of the molten globule are well known, and have been systematized in a number of reviews (e.g., Ptitsyn 1995). It has been established that the protein molecule in this intermediate state has no (or has only a trace of) rigid cooperatively melted tertiary structure, that is, it is denatured. Small-angle X-ray scattering showed that the protein molecule in this intermediate state has a globular structure typical of native globular proteins (Eleiser et al. 1993; Kataoka et al. 1993, 1997; Semisotnov et al. 1996; Uversky et al. 1998). 2D NMR coupled with hydrogen-deuterium exchange showed that the protein molecule in the molten globule state is characterized not only by the native-like secondary structure content, but also by the native-like folding pattern (Baum et al. 1989; Bushnel et al. 1990; Jeng et al. 1990; Chyan et al. 1993; Wu et al. 1993; Eliezer et al. 1998; Bose et al. 1999; Bracken 2001). A considerable increase in the accessibility of a protein molecule to proteases was noted as a specific property of the molten globule state (Merrill et al. 1990; Fontana et al. 1993). It was also shown that transformation into this intermediate state is accompanied by a considerable increase in the affinity of a protein molecule to the hydrophobic fluorescence probes (such as 8-anilinonaphthalene-1-sulfonate, ANS) and this behavior should be considered as a characteristic property of the molten globule state (Semisotnov et al. 1991; Uversky et al. 1996). Finally, it was established that the averaged value for the increase in the hydrodynamic radius in the molten globule state compared with the native state is no more than 15%, which corresponds to volume increase of
approximately 50%.
The structural peculiarities of a polypeptide chain in the premolten globule state are summarized below. The protein molecule in this state is denatured, that is, it has no rigid tertiary structure. It is characterized by a considerable secondary structure, although much less pronounced than that of the native or the molten globule protein (protein in the premolten globule state has approximately 50% of the native secondary structure, whereas in the molten globule state the corresponding value is close to 100%). The protein molecule in the premolten globule state is considerably less compact than in the molten globule or native states, but it is still more compact than the random coil (its hydrodynamic volume in the molten globule, the premolten globule, and the unfolded states, in comparison to that of the native state, increases 1.5, approximately 3, and approximately 12 times, respectively). The protein molecule in the premolten globule state can effectively interact with the hydrophobic fluorescent probe ANS, although considerably more weakly than in the molten globule state. This means that at least part of the solvent-accessible hydrophobic clusters of the polypeptide chain is already formed in the premolten globule state (Uversky and Ptitsyn 1994, 1996a; Ptitsyn 1995; Uversky 1997, 1998). It has also been established that in the premolten globule state the protein molecule has no globular structure (Uversky 1997, 1998; Uversky et al. 1998). The last observation indicates that the premolten globule probably represents a "squeezed" and partially ordered form of a coil. Finally, it has been shown that the premolten globule is separated from the molten globule state by an all-or-none transition, which represents an intramolecular analog of the first-order phase transition (Uversky and Ptitsyn 1994, 1996a; Ptitsyn 1995; Uversky 1997, 1998). This means that the molten globule and premolten globule represent different thermodynamic (phase) states.
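The radius and volume figures quoted above are linked by simple geometry. As a worked check, assuming only that the hydrodynamic volume scales as the cube of the hydrodynamic radius (an equivalent-sphere assumption), a 15% increase in radius gives the quoted volume increase of about 50%, and the volume ratios of 1.5, 3, and 12 correspond to the following radius ratios:

```latex
\[
\frac{V_h^{\mathrm{MG}}}{V_h^{\mathrm{N}}}
  = \left(\frac{R_h^{\mathrm{MG}}}{R_h^{\mathrm{N}}}\right)^{3}
  = (1.15)^{3} \approx 1.52
\]
\[
\frac{R_h}{R_h^{\mathrm{N}}} = \left(\frac{V_h}{V_h^{\mathrm{N}}}\right)^{1/3}:
\qquad
1.5^{1/3} \approx 1.14, \qquad
3^{1/3} \approx 1.44, \qquad
12^{1/3} \approx 2.29
\]
```

On this estimate, the premolten globule is roughly 1.4 times, and the unfolded chain roughly 2.3 times, larger in radius than the native protein.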
As native, molten globule, premolten globule, and unfolded conformations possess defined structural differences along with increasing amounts of disorder, they may be easily discriminated from one another by several physico-chemical methods (Uversky 1999). These techniques are briefly considered below.

$\Delta\nu^{\mathrm{eff}}$ (which is the difference in the number of denaturant molecules "bound" to one protein molecule in its two states) should be determined. Then this quantity should be compared to the $\Delta\nu^{\mathrm{eff}}_{\mathrm{N}\to\mathrm{U}}$ and $\Delta\nu^{\mathrm{eff}}_{\mathrm{MG}\to\mathrm{U}}$ values corresponding to the native to coil and molten globule to coil transitions in globular protein of a given molecular mass, respectively (Uversky and Ptitsyn 1996b). Application of several techniques mentioned above to a given protein provides the most unambiguous evidence for the presence of partially folded intermediates.
It has been shown that a considerable number of proteins possess some amount of disorder rather than rigid structure. A special term, "natively denatured," was introduced in 1994 (Schweers et al. 1994) to emphasize the existence of a drastic structural difference between "normal" globular protein, with rigid tertiary structure, and an "abnormal" extremely flexible tau protein. Two years later, a new term, "natively unfolded," originated as a result of conformational analysis of α-synuclein, which under physiological conditions appeared to lack any secondary structure (Weinreb et al. 1996). Two alternative terms, "intrinsically unstructured" (Wright and Dyson 1999) and "intrinsically disordered" (Dunker et al. 2001), have also been suggested to describe these proteins. Because "abnormal" proteins show an extremely wide diversity in their structural properties, the meaning of the above terms should be clarified. Thus, the terms denatured and disordered may be considered as synonyms, and indicate any set of nonrigid conformations of polypeptide chains including different compact partially folded conformations (molten globules and premolten globules) and random coil. The terms unstructured and unfolded may be considered synonymous, and should only be applied to the subset of disordered proteins characterized by the absence of any (or almost any) ordered structure. For the remainder of this review, only natively unfolded proteins will be considered, excluding "native molten globules".
Are natively unfolded proteins common?
The number of proteins and protein domains that have been shown in vitro to have little or no ordered structure under physiological conditions is rapidly expanding. For example, over the past 10 years there has been a significant increase in publications describing structural properties of intrinsically unstructured (natively unfolded) proteins, starting from two papers in 1991 and ending with more than 30 in 2000. During the same time other interesting aspects of natively unfolded proteins have also been investigated. For example, 2382, 1960, and 370 papers were published concerning different aspects of amyloid-beta peptide, tau protein, and α-synuclein, respectively.
The current list of different natively unfolded proteins includes more than 100 entries, with information on 91 of them presented in our recent work (Uversky et al. 2000a) and in Table 1. Only full-length proteins or domains with chain length greater than 50 amino acid residues have been considered. This list would probably be doubled if shorter polypeptides 30 to 50 residues long were included. Finally, the set of 100 proteins described in the literature as "natively unfolded" has at least 250 homologs, which are also expected to be natively unfolded. Additionally, a large number of proteins and protein domains have been predicted to be disordered based on the results of the analysis of amino acid sequences using the neural network predictors (Dunker et al. 1998, 2001; Romero et al. 1998b). All this shows that polypeptides without ordered structure under physiological conditions are common, rather than exceptions.
The existence of at least three different disordered equilibrium conformations, molten globule (MG), premolten globule (PMG), and unfolded (random coil-like, U), has been established for typical globular proteins (Uversky and Ptitsyn 1994, 1996a; Ptitsyn 1995; Uversky 1997, 1998). Apparently, the ability of a protein to adopt different stable conformations is an intrinsic property of a polypeptide chain. Although the correct folding of a protein into its rigid biologically active conformation is thought to be determined by its amino acid sequence (Anfinsen et al. 1961), the absence of rigid structure in natively unfolded proteins may be reflected in specific features of their amino acid sequences.
In an attempt to understand the relationship between sequence and disorder, Dunker and coauthors have developed several neural network predictors (Romero et al. 1997, 1998a, 1998b, 2001a; Dunker et al. 1998, 2001; Garner et al. 1998; Li et al. 1999, 2000). They assumed that if a protein structure has evolved to have a functional disordered state then a propensity for disorder might be predictable from its amino acid sequence and composition. The results of such analysis were impressive. It was established that disordered regions shared at least some common sequence features between many proteins, and that more than 15,000 proteins in the Swiss Protein database were identified as having long regions of sequence that shared these features (Romero et al. 1998b). Interestingly, the Top 20 proteins (i.e., proteins with the highest scores) were shown to have low sequence complexity, as defined by Wootton (1993, 1994; Wootton and Federhen 1996). In other words, sequences of natively unfolded proteins may be essentially degenerate. Figure 2A illustrates this idea, comparing the s-antigen from Plasmodium (which was shown to be at the head of the Top 20) with human serum albumin (a rigid globular protein of similar molecular mass) in terms of their amino acid composition scaled according to McCaldon and Argos (1988). Interestingly, it was later established that the distributions of the complexity values for ordered and disordered sequences overlapped (Romero et al. 2001b), suggesting that low sequence complexity did not represent the only characteristic feature of intrinsically disordered proteins. However, some general sequence peculiarities of natively unfolded proteins were recognized long ago. These include the presence of numerous uncompensated charged groups, resulting in a large net charge at neutral pH (Hemmings et al. 1984; Gast et al. 1995; Weinreb et al. 1996), and a low content of hydrophobic amino acid residues (Hemmings et al. 1984; Gast et al. 1995).
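The notion of low sequence complexity can be made concrete with a small sketch. It uses the Shannon entropy of the amino acid composition as a simplified stand-in for the complexity measures of Wootton. The function names and the 12-residue window are illustrative assumptions, not the published procedure.

```python
import math
from collections import Counter
from typing import List


def shannon_complexity(sequence: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition.
    0 means a single repeated residue; log2(20) (about 4.32) is the
    maximum for the 20 standard amino acids."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    n = len(sequence)
    entropy = 0.0
    for count in counts.values():
        p = count / n
        entropy -= p * math.log2(p)
    return entropy


def windowed_complexity(sequence: str, window: int = 12) -> List[float]:
    """Entropy of every full-length sliding window along the sequence.
    The window length is an illustrative choice."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        shannon_complexity(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


# A degenerate (repetitive) segment scores far lower than a mixed one.
low = shannon_complexity("GGGGSGGGGSGGGGS")
high = shannon_complexity("ACDEFGHIKLMNPQRSTVWY")
```

A composition-only entropy ignores residue order, so it is a coarse proxy. It is enough to show why a highly repetitive sequence counts as degenerate.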
$$\langle H\rangle_{\mathrm{boundary}} = \frac{\langle R\rangle + 1.151}{2.785} \qquad (1)$$
This equation gives the estimation of the "boundary" mean hydrophobicity value, $\langle H\rangle_{\mathrm{boundary}}$, below which a polypeptide chain with a given mean net charge $\langle R\rangle$ will most probably be unfolded. Thus, sequences of natively unfolded proteins may be characterized by a low sequence complexity and/or high net charge coupled with low mean hydrophobicity.
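The charge-hydrophobicity criterion can be sketched in a few lines. Several assumptions are made here and are not taken from the text above. Hydrophobicity is the Kyte-Doolittle scale rescaled to the range 0 to 1. The mean net charge counts Lys and Arg as +1 and Asp and Glu as -1 at neutral pH. The coefficients 1.151 and 2.785 are the commonly quoted charge-hydropathy boundary values. All function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydrophobicity(sequence: str) -> float:
    """Mean hydropathy with each value rescaled from [-4.5, 4.5] to [0, 1]."""
    seq = sequence.upper()
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(sequence: str) -> float:
    """Absolute net charge per residue, assuming K/R = +1 and D/E = -1."""
    seq = sequence.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return abs(positive - negative) / len(seq)


def boundary_hydrophobicity(mean_charge: float) -> float:
    """Boundary mean hydrophobicity for a given mean net charge
    (assumed coefficients, see the lead-in)."""
    return (mean_charge + 1.151) / 2.785


def predicted_unfolded(sequence: str) -> bool:
    """True if the sequence falls on the 'unfolded' side of the boundary."""
    return mean_hydrophobicity(sequence) < boundary_hydrophobicity(
        mean_net_charge(sequence)
    )


# Toy sequences, not real proteins: one highly charged, one hydrophobic.
charged = predicted_unfolded("EEEEKKEEDD")
hydrophobic = predicted_unfolded("LLLLIIIIVVVV")
```

The two toy sequences fall on opposite sides of the boundary. Real sequences near the boundary would need more than these two parameters to be classified with confidence.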
How unfolded are intrinsically unstructured proteins? A classification attempt
It is well known that in the presence of large concentrations of strong denaturants, such as 8 M urea or 6 M GdmCl, normal proteins lose the majority of their specific structure, that is, become essentially unfolded (Anson and Mirsky 1932; Mirsky and Pauling 1936; Neurath et al. 1944; Tanford 1968). One can expect that under these conditions unfolded proteins will obey the theoretical and empirical rules that apply to linear random coils (Tanford 1968). In accordance with Tanford, a polymer molecule is randomly coiled when internal rotation can take place at about every single bond of the molecule with the same freedom with which it would take place in a molecule of low molecular weight containing the same kind of bonds (Tanford 1968). The properties of linear random coils are well understood, as synthetic polymers frequently adopt this conformation (Flory 1953; Tanford 1961). Because the dimensions of random coils depend only on the backbone rotational angles, the dependence of the hydrodynamic dimensions on molecular mass (length of polypeptide chain) represents the most effective diagnostic tool for recognition of linear random coils (Tanford 1961, 1968). Results of early studies on hydrodynamic dimensions of proteins in the presence of 6 M GdmCl were consistent with the conclusion that unfolded proteins could be described as random coils (Tanford 1961, 1968). However, it was later established by heteronuclear NMR that even in high concentrations of strong denaturants, when the native state of globular proteins breaks down, the polypeptide chains contained some amount of residual structure, that is, the polypeptide chain did not reach a random coil conformation (Dill and Shortle 1991; Logan et al. 1994; Zhang et al. 1994; Shortle 1996; Pappu et al. 2000; Baldwin and Zimm 2000). These findings raised several compelling biophysical questions related to the structural characteristics of natively unfolded proteins. How unfolded are these proteins? 
Are they random coils, or do they possess residual structure? If they have residual structure, how then should they be classified? Fortunately, the information accumulated to date on natively unfolded proteins allows us to make an initial structural classification of these intriguing members of the polypeptide kingdom.
As follows from their definition, intrinsically unstructured proteins show complete (or almost complete) loss of any ordered structure under physiological conditions in vitro; that is, they should behave as random coils. Structurally, this may be manifested by (1) larger hydrodynamic dimensions compared to typical native globular proteins with corresponding molecular mass, (2) low content of ordered secondary structure, and (3) high intramolecular flexibility. Such anomalous behavior is usually detected by numerous hydrodynamic techniques (gel-filtration, viscometry, SAXS, SANS, sedimentation, and dynamic and static light scattering), far-UV CD, ORD, FTIR, and NMR spectroscopy (one-dimensional and heteronuclear multidimensional). These techniques may also be used for the identification of residual structure (if any) in an unfolded protein molecule. Once again, simultaneous application of several approaches should permit one to reach a more reliable conclusion.
Flexibility and residual structure by NMR spectroscopy
NMR spectroscopy of natively unfolded proteins has established that they contain varied amounts of residual structure. Examples of proteins which are essentially unfolded under in vitro physiological conditions include: DFF45 N-terminal domain (Zhou et al. 2001), DNA-binding domain of vitamin D receptor (Craig et al. 1997), C-terminal domain of anti-sigma factor FlgM (Daughdrill et al. 1998), p53 regulatory domain of p19Arf tumor suppressor (DiGiammarino et al. 2001), substrate-binding peptide from DNA polymerase I (Mullen et al. 1993), poplar apo-plastocyanin (Bai et al. 2001), N-terminal domain of StAR (Song et al. 2001), bone sialoprotein and osteopontin (Fisher et al. 2001), C-terminal domains of α- and β-tubulins (Jimenez et al. 1999), N-terminal activation domain of heat-shock transcription factors (Cho et al. 1996), 4E-binding proteins I and II (Fletcher et al. 1998), cyclin-dependent kinase inhibitor p21Waf1/Cip1/Sdi1 (Kriwacki et al. 1996), SNase Δ131Δ fragment (Alexandrescu et al. 1994; Gillespie and Shortle 1997a, 1997b), desiccation-related protein (Lisse et al. 1996), functional domain of eIF4G1 (Hershey et al. 1999), cytoplasmic domain of synaptobrevin (Hazzard et al. 1999), N-terminal domain of prion protein (Donne et al. 1997), C-terminal HMG domain of LEF-1 (Love 1999), N-terminal region of TAFII-230 (residues 11-77) (Liu et al. 1998), antitermination protein N (Mogridge et al. 1998), cytoplasmic domain of Snc1 (Fiebig et al. 1999), prothymosin α (Uversky et al. 1999), nonhistone chromosomal proteins HMG-14 (Cary et al. 1980), HMG-17 (Abercrombie et al. 1978), HMG-T and HMG-H6 (Cary et al. 1981), fibronectin-binding domains (Penkett et al. 1998), DNA-binding domain of GCN4 (Weiss et al. 1990), EMB-1 protein (Eom et al. 1996), NEF protein (Geyer et al. 1999), osteocalcin (Isbell et al. 1993), two-domain fragment of neutral zinc finger factor 1 (Berkovits and Berg 1999), and several other proteins. On the other hand, heteronuclear multidimensional NMR analysis provided evidence of some ordered structure in several natively unfolded proteins, although the amount and quality of residual structure varied tremendously. In some cases the authors were unable to detect any secondary or tertiary contacts (e.g., Abercrombie et al. 1978; Cary et al. 1980, 1981; Cho et al. 1996; Eom et al. 1996; Lisse et al. 1996; Penkett et al. 1997; Fletcher et al. 1998; Zhang and Matthews 1998; Bose et al. 1999; Uversky et al. 1999; Campbell et al. 2000; Fisher et al. 2001; DiGiammarino et al. 2001). In other cases it was concluded that the proteins contained mostly dynamic structure favoring helical or β-structural conformation (e.g., Mullen et al. 1993; Schmitz et al. 1994; Kriwacki et al. 1996; Gillespie and Shortle 1997a, 1997b; Daughdrill et al. 1998; Bai et al. 2001, and many others). Thus, NMR analysis clearly showed that natively unfolded proteins do not possess uniform structural properties, as expected for members of a single thermodynamic entity.
Hydrodynamic dimensions
As previously discussed, the most unambiguous characteristic of the conformational state of a globular protein is its hydrodynamic dimension. In fact, it has been shown that equilibrium conformations of a globular protein (native, molten globule, premolten globule, and unfolded states) may easily be discriminated by the degree of compactness of the polypeptide chain (Uversky 1993, 1994, 1997, 1998; Uversky and Ptitsyn 1994, 1996a; Ptitsyn 1995). The equilibrium conformations were characterized by very different dependencies of their hydrodynamic dimensions on molecular mass (length of polypeptide chain) (Tanford 1961, 1968; Uversky 1993; Tcherkasskaya and Uversky 2001).
To clarify the physical nature of natively unfolded proteins, Figure 3 represents the dependencies of their hydrodynamic dimensions on the length of their polypeptide chains (see Table 1). The same trends determined for globular proteins in their native, molten globule, premolten globule, and urea or GdmCl unfolded states are shown for comparison. The values of the hydrodynamic volumes, $V_h$, were calculated from the corresponding Stokes radii, $R_S$, as $V_h = \frac{4}{3}\pi R_S^3$. Data for the different conformations of globular proteins were taken from Tcherkasskaya and Uversky (2001). For these species, the dependencies of $V_h$ on the length of the polypeptide chains, $N$, were described by a set of straight lines (using a logarithmic scale):
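[Equations 2-6: straight-line dependencies of $\log V_h$ on $\log N$ for the native, molten globule, premolten globule, urea-unfolded, and GdmCl-unfolded states of globular proteins.]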
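The two calculations described here, converting a Stokes radius into a hydrodynamic volume and describing the chain-length dependence by a straight line on a logarithmic scale, can be sketched as follows. This is a minimal illustration using ordinary least squares on base-10 logarithms. The function names are hypothetical, and the fitting procedure is not necessarily the one used for the published fits.

```python
import math
from typing import Sequence, Tuple


def hydrodynamic_volume(stokes_radius: float) -> float:
    """Volume of the equivalent sphere, V_h = (4/3) * pi * R_S**3.
    The result is in cubic units of whatever unit the radius is given in."""
    return (4.0 / 3.0) * math.pi * stokes_radius ** 3


def fit_log_log(
    chain_lengths: Sequence[float], volumes: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of log10(V_h) = intercept + slope * log10(N).
    Returns (intercept, slope)."""
    if len(chain_lengths) != len(volumes) or len(chain_lengths) < 2:
        raise ValueError("need at least two matching (N, V_h) pairs")
    xs = [math.log10(n) for n in chain_lengths]
    ys = [math.log10(v) for v in volumes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("chain lengths must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

A steeper slope means that the volume grows faster with chain length, which is what distinguishes coil-like chains from compact globules in a plot of this kind.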
Figure 3 shows that natively unfolded proteins are much less compact than native and molten globule proteins of similar molecular mass. Surprisingly, under physiological conditions in vitro (i.e., in aqueous solution, a poor solvent for a polypeptide chain), natively unstructured proteins split into two different subclasses (see also Table 1). Proteins from the first subclass, which consists of 17 representatives, behave as random coils in poor solvent, whereas the 18 proteins of the second subclass are considerably more compact, being close to premolten globules, as follows from their hydrodynamic parameters (cf. equations 3-5):
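[Equations 7 and 8: the corresponding dependencies of $\log V_h$ on $\log N$ for the coil-like and premolten globule-like natively unfolded proteins.]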
Residual secondary structure from far-UV CD spectra
Unfolded polypeptide chains are characterized by a very specific shape of their far-UV CD spectrum, with an intensive minimum in the vicinity of 200 nm and an ellipticity close to zero in the vicinity of 222 nm (Adler et al. 1973; Provencher and Glöckner 1981; Johnson 1988; Woody 1995; Fasman 1996; Kelly and Price 1997; Uversky 1999). This is a very useful graphical criterion for the selection of natively unfolded proteins (see Table 1). To date, a coil-like shape of far-UV CD spectrum has been reported for approximately 100 proteins (see Table 1), which is almost threefold larger than the number of proteins shown to be unfolded in accordance with their hydrodynamic dimensions (35).
Figure 4 represents a "double wavelength" plot, $[\theta]_{222}$ versus $[\theta]_{200}$, that may be used to sort natively unfolded proteins into two nonoverlapping groups. Fifty-one proteins were characterized by far-UV CD spectra characteristic of almost completely unfolded polypeptide chains, with $[\theta]_{200} = -(18{,}900 \pm 2800)\ \mathrm{deg\,cm^2\,dmol^{-1}}$ and $[\theta]_{222} = -(1700 \pm 700)\ \mathrm{deg\,cm^2\,dmol^{-1}}$. On the other hand, 44 other protein spectra were consistent with the existence of some residual secondary structure, possessing a shape typical of the premolten globule state of globular proteins (with $[\theta]_{200} = -(10{,}700 \pm 1300)\ \mathrm{deg\,cm^2\,dmol^{-1}}$ and $[\theta]_{222} = -(3900 \pm 1100)\ \mathrm{deg\,cm^2\,dmol^{-1}}$).
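The two-group separation in the double wavelength plot can be illustrated with a simple nearest-centroid rule. This is only a sketch: the group means and spreads are the values quoted above, with the ellipticity at 200 nm taken as negative for both groups. The scaled-distance rule and all names are illustrative assumptions, not the procedure used to build the plot.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CDGroup:
    """Mean and spread of the molar ellipticity (deg cm^2 dmol^-1)
    at 200 and 222 nm for one group of natively unfolded proteins."""
    name: str
    theta200_mean: float
    theta200_sd: float
    theta222_mean: float
    theta222_sd: float


# Group parameters as quoted in the text.
COIL = CDGroup("coil-like", -18900.0, 2800.0, -1700.0, 700.0)
PMG = CDGroup("premolten globule-like", -10700.0, 1300.0, -3900.0, 1100.0)


def scaled_distance(theta200: float, theta222: float, group: CDGroup) -> float:
    """Distance to the group centre, each axis scaled by the group's spread."""
    d200 = (theta200 - group.theta200_mean) / group.theta200_sd
    d222 = (theta222 - group.theta222_mean) / group.theta222_sd
    return math.hypot(d200, d222)


def classify(theta200: float, theta222: float) -> str:
    """Assign a spectrum to the nearer of the two groups (illustrative rule)."""
    nearest = min(
        (COIL, PMG), key=lambda g: scaled_distance(theta200, theta222, g)
    )
    return nearest.name
```

A spectrum that lies far from both centres should be treated as unassigned rather than forced into a group. This sketch leaves that check out for brevity.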
Amino acid composition of native coils and native premolten globules
Figure 5 compares the amino acid compositions of intrinsic coils and intrinsic premolten globules. Protein sets analyzed in Figure 4 were used to create these graphs. The inset to Figure 5 shows that proteins from both subclasses occupy the same region of the charge-hydrophobicity phase space. Native coils were more dispersed, whereas intrinsic premolten globules were localized closer to the border between intrinsically unstructured and native proteins (cf. Fig. 2). To confirm this idea, the corresponding distances between the given sequence and the border between intrinsically unstructured and native proteins were calculated. Results of this comparison are shown in Figure 5 as $\Delta\langle H\rangle = (\langle H\rangle_{\mathrm{boundary}} - \langle H\rangle)$ plots. The mean "boundary" hydrophobicity, $\langle H\rangle_{\mathrm{boundary}}$, for a given polypeptide chain with a mean net charge $\langle R\rangle$ has been calculated using equation 1. One can see that intrinsic coils are considerably more distant from the border than intrinsic premolten globules. Statistical analysis shows that the averaged $\Delta\langle H\rangle$ values are -(0.089 ± 0.086) and -(0.037 ± 0.033) for the native coils and native premolten globules, respectively. However, because the sequence characteristics of the two subclasses overlap, it is difficult to differentiate these proteins by taking into account their mean hydrophobicity and mean net charge only. Probably, some other sequence features, such as the propensity to form secondary structure, should also be considered.
The functional importance of being disordered has been intensively analyzed (Schulz 1979; Pontius 1993; Dunker et al. 1997, 2001; Plaxco and Gross 1997; Wright and Dyson 1999). It has been established that increased intrinsic plasticity represents an important prerequisite for effective molecular recognition (Plaxco and Gross 1997; Wright and Dyson 1999; Dunker et al. 2001). The range of biological functions of intrinsically disordered proteins is very wide, including cell cycle control, transcriptional and translational regulation, modulation of activity and/or assembly of other proteins, and even regulation of nerve cell function (reviewed in Wright and Dyson 1999; Dunker et al. 2001). Importantly, the majority of intrinsically disordered proteins undergo a disorder-to-order transition upon functioning (Schulz 1979; Pontius 1993; Spolar and Record 1994; Rosenfeld et al. 1995; Plaxco and Gross 1997; Dunker et al. 1997, 2001; Wright and Dyson 1999). It has been suggested that the persistence of natively unfolded proteins throughout evolution may reside in advantages of flexible structure during disorder-order transitions in comparison with rigid proteins (Dunker et al. 1997, 1998, 2001; Romero et al. 1998b; Wright and Dyson 1999). Among the potential advantages of intrinsic lack of structure and function-related disorder-order transitions are (1) the possibility of high specificity coupled with low affinity (Schulz 1979; Kriwacki et al. 1996; Dunker et al. 1998, 2001); (2) the ability to bind to several different targets (Wright and Dyson 1999; Dunker et al. 2001), known as one-to-many signaling (Romero et al. 1998b); (3) the capability to overcome steric restrictions, enabling considerably larger interaction surfaces in the complex than could be obtained for rigid partners (Meador et al. 1992; Choo and Schwabe 1998; Dunker et al. 2001); (4) the precise control and simple regulation of the binding thermodynamics (Schulz 1979; Spolar and Record 1994; Rosenfeld et al. 1995; Wright and Dyson 1999; Dunker et al. 2001); (5) the increased rates of specific macromolecular association (Pontius 1993; Dunker et al. 2001); and (6) the reduced lifetime of intrinsically disordered proteins in the cell, possibly representing a mechanism of rapid turnover of the important regulatory molecules (Wright and Dyson 1999). There is, however, an alternative explanation for the involvement of intrinsic disorder in protein function. Computer modeling has shown that selective pressure for functionality is largely unrelated to that for stability and foldability. In this view, a protein that successfully folded into one structure would likely be as functional as a protein that successfully folded into an alternative structure. Including functionality in the model does not greatly alter the distribution of the observed structures (Williams et al. 2001). This could mean that metastable proteins are favored during evolution because far more sequences code for these proteins than for very rigid ones. In other words, the involvement of intrinsic disorder in protein function may be related to history and evolution rather than to functional needs (Dunker et al. 2001).
In their excellent review, Dunker et al. (2001) formulated the idea that the protein structure–function paradigm (which emphasizes that ordered 3D structures represent the indispensable prerequisite to effective protein functioning) should be recast as The Protein Trinity paradigm (see Fig. 6A). According to The Protein Trinity model, native intracellular proteins (or their functional regions) can exist in any of three thermodynamic states: ordered, molten globule, and random coil. Function can arise from any of the three conformations and from transitions between them. "In this view, not just the ordered state, but any of the three states can be the native state of a protein" (Dunker et al. 2001). Experimental results on the conformational behavior of intrinsically unstructured (natively unfolded) proteins indicated, however, that these proteins did not possess uniform structural properties, as would be expected for members of one thermodynamic group, random coils. They were split into two structurally different subclasses, which, by analogy with the conformational states of globular proteins, may be designated as intrinsic coils and intrinsic premolten globules. Moreover, it was already noted that the molten globule and the premolten globule might represent different phase states of the protein, as they are separated by a first-order phase transition (Uversky and Ptitsyn 1994, 1996a; Ptitsyn 1995; Uversky 1997, 1998). These observations bring a new player, the native premolten globule, onto the protein functioning field. In other words, The Protein Trinity should be extended to The Protein Quartet model, with function arising from four specific conformations (ordered forms, molten globules, premolten globules, and random coils) and transitions between any two of the states (see Fig. 6B). Experimental evidence for the validity of this extension is presented below.
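The difference between the two models can be made concrete by counting the pairwise transitions that each one allows. The following is only an illustrative sketch: the state names are taken from the text, whereas the variable and function names are hypothetical and are not part of the original work.

```python
from itertools import combinations

# Conformational states named in the text; identifiers are illustrative only.
TRINITY = ("ordered", "molten globule", "random coil")
QUARTET = ("ordered", "molten globule", "premolten globule", "random coil")


def pairwise_transitions(states):
    """Return every unordered pair of distinct states."""
    return list(combinations(states, 2))


trinity_pairs = pairwise_transitions(TRINITY)
quartet_pairs = pairwise_transitions(QUARTET)

# Transitions that become possible only when the premolten globule is added.
new_pairs = [pair for pair in quartet_pairs if pair not in trinity_pairs]

print(len(trinity_pairs), len(quartet_pairs))
for first, second in new_pairs:
    print(f"{first} <-> {second}")
```

With three states there are three possible pairwise transitions; adding the premolten globule raises the count to six, and each of the three additional transitions involves the premolten globule.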
1 (Grottesi et al. 1998), prothymosin α (Uversky et al. 2000b), human sperm protamines P2 and P3 (Gatewood et al. 1996), and phosphodiesterase γ-subunit (Uversky et al. 2002). Human α-synuclein was also shown to be partially folded in the presence of several divalent and trivalent metal ions (Uversky et al. 2001a). Analysis of structural changes associated with the cation binding showed that the transformation of intrinsic coils into premolten globule-like conformations took place in these cases.
Function-related coil–molten globule transitions
The myelin basic protein, MBP, is a major protein of myelin, the multilamellar membranous sheath surrounding nerve axons. MBP was isolated in water-soluble or detergent-soluble form together with endogenous myelin lipids. The water-soluble form is a member of the intrinsic coil family. Binding of lipids transformed this protein into the molten globule-like conformation (Polverini et al. 1999). Similar structural rearrangements were induced in the coil-like 77–262 fragment of the glucocorticoid receptor (Baskakov et al. 1999) and in the N-terminal domain of HIV-1 integrase (Zheng et al. 1996) by TMAO and by Zn2+ binding, respectively. Self-association of the intrinsically unfolded γ-subunit of phosphodiesterase induced folding of this protein into a molten globule-like conformation (Uversky et al. 2002).
Function-related coil–rigid structure transitions
The N-terminal domain of the caspase-activated DNA fragmentation factor DFF45 was unfolded in solution. Its folding into the rigid 3D structure was induced upon interaction with the N-terminal domain of DFF40 (Zhou et al. 2001). Structural analysis revealed that the isolated 50S ribosomal proteins, L22 and L27, and 30S ribosomal protein S19 were essentially unfolded in solution (Venyaminov et al. 1981). However, they transform into a rigid well-folded conformation in the functional ribosome (Yusupov et al. 2001).
Function-related premolten globule–molten globule transitions
The human antibacterial peptide LL-37 existed in the premolten globule-like conformation at micromolar concentrations in aqueous solution. A cooperative transition from a disordered to a helical molten globule-like structure was observed in the presence of several anions or with increasing protein concentration. The extent of α-helicity correlated well with the antibacterial activity of LL-37 against both Gram-positive and Gram-negative bacteria (Johansson et al. 1998). A comparable degree of folding was induced in osteocalcin as a result of binding of Ca2+, Lu3+ (Isbell et al. 1993), or Pb2+ (Dowd et al. 2001). Premolten globule to molten globule transitions accompanied Ca2+ binding to skeletal muscle sarcoplasmic reticulum calsequestrin (Cozens and Reithmeier 1984; He et al. 1993) and to SPARC, an extracellular glycoprotein expressed in mineralized and nonmineralized tissues (Engel et al. 1987).
The DNA-binding domain of the 1,25-dihydroxyvitamin D3 receptor was shown to undergo a premolten globule-to-molten globule transition as a result of specific Zn2+ binding. This cation-induced folding was an important prerequisite to the formation of a functional complex with osteopontin and several vitamin D response elements (Craig et al. 1997). The first step in steroidogenesis is the movement of cholesterol from the outer to the inner mitochondrial membrane, which is facilitated by the steroidogenic acute regulatory protein StAR. The interaction of premolten globule-like StAR with dodecylphosphocholine and phospholipid liposomes was accompanied by a transition of the protein into the molten globule conformation (Song et al. 2001). The E7 gene of the human papillomaviruses encodes a 98-amino acid multifunctional nuclear phosphoprotein, the E7 protein, which cooperates with an activated ras oncogene to transform primary rodent cells. CD spectroscopy indicated that Zn2+ and Cd2+ binding by the HPV16 E7 protein induced structural transformations consistent with a premolten globule–molten globule transition (Pahel et al. 1993).
Transitions of intrinsic premolten globules to rigid conformation
It was shown that self-dimerization and DNA binding induce rigid 3D structure in the intrinsic premolten globule-like Max protein (Ferre-D'Amare et al. 1993; Horiuchi et al. 1997). Specific Ca2+ binding initiated folding of the premolten globule-like B-repeat segment of SdrD (Josefsson et al. 1998). Similarly, the 50S ribosomal proteins L2, L3, L14, L23, L24, and L32, as well as the 30S ribosomal proteins S12 and S18 were native premolten globules in their free forms (Venyaminov et al. 1981), but adopted rigid well-folded conformations during the formation of a functional ribosome (Yusupov et al. 2001).
The Escherichia coli RNase HI variant with the K86A mutation was purified in two forms: nicked and intact. The nicked protein, resulting from the cleavage of the Lys87–Arg88 peptide bond, was enzymatically active. The N-terminal fragment possessed characteristics of a molten globule, whereas the C-terminal fragment was essentially disordered. The premolten globule-like C-fragment underwent a transition to the rigid 3D structure as a result of RNase HI reconstitution (Kanaya and Kanaya 1995). Comparably, the intrinsically unstructured β-subunit of SMK killer toxin folded into a rigid conformation as a result of interaction with the α-subunit (Suzuki et al. 1997). Finally, the formation of a yeast SNARE complex was accompanied by complete folding of two of its components, Snc1 and Sec9 (Rice et al. 1997).
RNase P is the endoribonuclease responsible for the 5′-maturation of precursor tRNA transcripts. Intriguingly, RNase P from Bacillus subtilis, being predominantly unfolded in 10 mM sodium cacodylate at neutral pH, folded into a native α/β structure upon addition of various small molecular anions (Henkels et al. 2001). This protein (Henkels et al. 2001), as well as the reduced RNase T1 (Baskakov and Bolen 1998), also underwent a cooperative folding transition upon addition of the osmolyte TMAO.
The cyclin-dependent kinase (Cdk) inhibitor p21Waf1/Cip1/Sdi1 and its N-terminal fragment lack stable secondary and tertiary structure in the free solution state. In sharp contrast to the disordered free solution state, these proteins adopted an ordered stable conformation when bound to Cdk2 (Kriwacki et al. 1996).
Conclusions
Conformational analysis shows that natively unfolded proteins do not represent a uniform family, but rather two structurally different groups. Proteins from the first group have hydrodynamic dimensions typical of random coils in poor solvent (i.e., they behave as slightly squeezed coils) and do not possess any (or almost any) ordered secondary structure. Proteins from the second group are essentially more compact (but still significantly less compact than native or molten globule proteins). They exhibit some amount of ordered secondary structure, although their far-UV CD spectra remain typical of an essentially disordered polypeptide chain, with a pronounced minimum in the vicinity of 200 nm. By analogy with the conformational classification of "normal" globular proteins, intrinsically unstructured proteins could be divided into intrinsic coils and intrinsic premolten globules.
Because the amino acid sequences of native coils are similar to native premolten globules (only slightly less hydrophobic and slightly more charged), some other sequence features (e.g., propensity to form secondary structure) have to be taken into account for the unambiguous sequence-based separation of the intrinsic coils from the intrinsic premolten globules.
An intriguing property of intrinsically unstructured proteins is their capability to undergo disorder-to-order transition upon functioning. The degree of these structural rearrangements varies over a very wide range, from coil–premolten globule transitions to formation of rigid ordered structures. Thus, protein functioning may be described by the Protein Quartet model, with biological activity arising from four unique conformations of the polypeptide chain (ordered forms, molten globules, premolten globules, and random coils) and transitions between any of them.
Acknowledgments
I am grateful to Prof. A.K. Dunker for the valuable discussions. I thank Dr. P. Souillac for the careful reading and editing of the manuscript. I appreciate Prof. J. Goers for his invaluable help with the manuscript improvement.
References
Abercrombie, B.D., Kneale, G.G., Crane Robinson, C., Bradbury, E.M., Goodwin, G.H., Walker, J.M., and Johns, E.W. 1978. Studies on the conformational properties of the high-mobility-group chromosomal protein HMG 17 and its interaction with DNA. Eur. J. Biochem. 84: 173–177.
Adler, A.J., Greenfield, N.J., and Fasman, G.D. 1973. Circular dichroism and optical rotatory dispersion of proteins and polypeptides. Methods Enzymol. 27: 675–735.
Agianian, B., Leonard, K., Bonte, E., Van der Zandt, H., Becker, P.B., and Tucker, P.A. 1999. The glutamine-rich domain of the Drosophila GAGA factor is necessary for amyloid fiber formation in vitro, but not for chromatin remodelling. J. Mol. Biol. 285: 527–544.
Alber, T., Gilbert, W.A., Ponzi, D.R., and Petsko, G.A. 1982. The role of mobility in the substrate binding and catalytic machinery of enzymes. Ciba Found. Symp. 93: 4–24.
Alexandrescu, A.T., Abeygunawardana, C., and Shortle, D. 1994. Structure and dynamics of a denatured 131-residue fragment of staphylococcal nuclease: A heteronuclear NMR study. Biochemistry 33: 1063–1072.
Amit, A.G., Mariuzza, R.A., Phillips, S.E.V., and Poljak, R.J. 1985. Three-dimensional structure of an antigen–antibody complex at 6 Å resolution. Nature 313: 156–158.
Anfinsen, C.B., Haber, E., Sela, M., and White, F.N. 1961. Kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proc. Natl. Acad. Sci. 47: 1309–1314.
Anson, M.L. and Mirsky, A.E. 1932. The effect of denaturation on the viscosity of protein systems. J. Gen. Physiol. 15: 341–350.
Aswad, D.W. and Greengard, P. 1981. A specific substrate from rabbit cerebellum for guanosine 3′:5′-monophosphate-dependent protein kinase. I. Purification and characterization. J. Biol. Chem. 256: 3487–3493.
Bai, Y., Chung, J., Dyson, H.J., and Wright, P.E. 2001. Structural and dynamic characterization of an unfolded state of poplar apo-plastocyanin formed under nondenaturing conditions. Protein Sci. 10: 1056–1066.
Baldwin, R.L. and Zimm, B.H. 2000. Are denatured proteins ever random coils? Proc. Natl. Acad. Sci. 97: 12391–12392.
Baskakov, I. and Bolen, D.W. 1998. Forcing thermodynamically unfolded proteins to fold. J. Biol. Chem. 273: 4831–4834.
Baskakov, I.V., Kumar, R., Srinivasan, G., Ji, Y.S., Bolen, D.W., and Thompson, E.B. 1999. Trimethylamine N-oxide-induced cooperative folding of an intrinsically unfolded transcription-activating fragment of human glucocorticoid receptor. J. Biol. Chem. 274: 10693–10696.
Baum, J., Dobson, C.M., Evans, P.A., and Hanly, C. 1989. Characterization of a partly folded protein by NMR methods: Studies on the molten globule state of guinea pig alpha-lactalbumin. Biochemistry 28: 7–13.
Belmont, L.D. and Mitchison, T.J. 1996. Identification of a protein that interacts with tubulin dimers and increases the catastrophe rate of microtubules. Cell 84: 623–631.
Berkovits, H.J. and Berg, J.M. 1999. Metal and DNA binding properties of a two-domain fragment of neural zinc finger factor 1, a CCHC-type zinc binding protein. Biochemistry 38: 16826–16830.
Bhattacharyya, J. and Das, K.P. 1999. Molecular chaperone-like properties of an unfolded protein, alpha(s)-casein. J. Biol. Chem. 274: 15505–15509.
Bienkiewicz, E.A., Moon Woody, A., and Woody, R.W. 2000. Conformation of the RNA polymerase II C-terminal domain: Circular dichroism of long and short fragments. J. Mol. Biol. 297: 119–133.
Bloomer, A.C., Champness, J.N., Bricogne, G., Staden, R., and Klug, A. 1978. A protein disk of tobacco mosaic virus at 2.8 Å resolution showing the interactions within and between subunits. Nature 276: 362–368.
Bode, W., Schwager, P., and Huber, R. 1978. The transition of bovine trypsinogen to trypsin-like state upon strong ligand binding. The refined crystal structures of the bovine trypsinogen–pancreatic trypsin inhibitor complex and of its ternary complex with Ile-Val at 1.9 Å resolution. J. Mol. Biol. 118: 99–112.
Bogdarina, I., Fox, D.G., and Kneale, G.G. 1998. Equilibrium and kinetic binding analysis of the N-terminal domain of the Pf1 gene 5 protein and its interaction with single-stranded DNA. J. Mol. Biol. 275: 443–452.
Bose, H.S., Whittal, R.M., Baldwin, M.A., and Miller, W.L. 1999. The active form of the steroidogenic acute regulatory protein, StAR, appears to be a molten globule. Proc. Natl. Acad. Sci. 96: 7250–7255.
Bouvier, M. and Stafford, W.P. 2000. Probing the three-dimensional structure of human calreticulin. Biochemistry 39: 14950–14959.
Bracken, C. 2001. NMR spin relaxation methods for characterization of disorder and folding in proteins. J. Mol. Graph. Model. 19: 3–12.
Bushnell, G.W., Louie, G.V., and Brayer, G.D. 1990. High-resolution three-dimensional structure of horse heart cytochrome c. J. Mol. Biol. 214: 585–595.
Campbell, K.M., Terrell, A.R., Laybourn, P.J., and Lumb, K.J. 2000. Intrinsic structural disorder of the C-terminal activation domain from the bZIP transcription factor Fos. Biochemistry 39: 2708–2713.
Cary, P.D., Crane Robinson, C., Bradbury, E.M., and Dixon, G.H. 1981. Structural studies of the non-histone chromosomal proteins HMG-T and H6 from trout testis. Eur. J. Biochem. 119: 545–551.