1 Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
2 Biology Department, Brookhaven National Laboratory, Upton, New York 11973, USA
3 Department of Biology, Washington University, St. Louis, Missouri 63130, USA
(RECEIVED June 19, 2006; FINAL REVISION August 21, 2006; ACCEPTED August 22, 2006)
| Abstract |
|---|
|
|
|---|
β-helix, or Rfr-fold, as observed for the tandem pentapeptide repeats in the only other PRP structure, the mycobacterial fluoroquinolone resistance protein MfpA from Mycobacterium tuberculosis. Sitting on top of the Rfr-fold are two short, antiparallel α-helices, bridged with a disulfide bond, that perhaps prevent edge-to-edge aggregation at the C terminus. Analysis of the main-chain (φ, ψ) dihedral orientations for the pentapeptide repeats in Rfr32 and MfpA makes it possible to recognize the structural details for the two distinct types of four-residue turns adopted by the pentapeptide repeats in the Rfr-fold. These turns, labeled type II and type IV β-turns, may be universal motifs that shape the Rfr-fold in all PRPs.
Keywords: cyanobacteria; β-bridges; circular dichroism; thermal melt; right-handed parallel β-helix; single-bridge β-sheet; β-bulges
| Introduction |
|---|
|
|
|---|
The first protein observed to contain pentapeptide repeats dates back to 1995 with the discovery of the hglK gene in the cyanobacterium Anabaena sp. strain PCC 7120 (Black et al. 1995). The hglK gene encodes a 727-residue protein with an N terminus predicted to contain four membrane-spanning regions followed by a region containing 36 consecutive pentapeptide repeats of the consensus sequence ADLSG. Chemical mutagenesis was used to generate an Anabaena strain that introduced a stop codon just before the pentapeptide repeat domain in the hglK gene, producing mutants with a distinct morphology compared to the wild-type strain that were incapable of forming the thick glycolipid layer external to the cell wall. The conclusions were that the HglK protein was membrane-associated, and the pentapeptide repeat domain was necessary for glycolipid transport and/or localization during heterocyst formation. However, the precise biochemical function and the three-dimensional structure of the HglK protein remain unknown.
A 398-residue PRP, termed RfrA, with a motif organization similar to Anabaena 7120 HglK was later identified in the photosynthetic bacterium Synechocystis sp. strain 6803 (Chandler et al. 2003). Like HglK, RfrA contains four membrane-spanning regions at its N terminus followed by a shorter run of 12 consecutive pentapeptide repeats at its C terminus (Chandler et al. 2003). While RfrA appears to be involved in the regulation of a manganese transport system different from the more thoroughly characterized ABC-transporter system in Synechocystis 6803, the mechanism of regulation is unknown. Two hypotheses are that RfrA alters the expression of the second unknown manganese transporter (transcriptional) or it may reversibly modify the second transporter (post-translational).
The hglK and rfrA genes are present in the chromosomes of the cyanobacteria Anabaena and Synechocystis, respectively. In many species of nonphotosynthetic Enterobacteriaceae, plasmids encoding a protein containing tandem pentapeptide repeats have been identified (Tran and Jacoby 2002; Nordmann and Poirel 2005). Biochemical characterization of this protein, Qnr, shows that in vitro, it protects Escherichia coli DNA gyrase and E. coli DNA topoisomerase IV from the inhibitory effects of the powerful fluoroquinolone antibiotics (Tran and Jacoby 2002; Tran et al. 2005). Fluoroquinolones exert their antibacterial properties by binding reversibly to normal DNA complexes formed between DNA gyrase and DNA topoisomerase IV (Drlica and Malik 2003). They stabilize a covalent tyrosyl-DNA phosphate ester that is normally a transient intermediate, preventing religation of the DNA. As a consequence, the phospho-phenolic linkage eventually is hydrolyzed and a DNA double-strand break is generated. The accumulation of double-strand breaks is lethal to the cell (van Gent et al. 2001). The Qnr protein was observed to compete with DNA for binding to DNA gyrase (Tran et al. 2005), suggesting that the antibiotic resistance provided by this PRP may be due to its interaction with DNA gyrase that prevents normal DNA binding. Structural evidence for such a mechanism was recently provided with the crystal structure of the first PRP, MfpA, from Mycobacterium tuberculosis (Hegde et al. 2005).
The structure of M. tuberculosis MfpA (mycobacterial fluoroquinolone resistance protein), a 183-residue protein that forms a dimer in solution, has recently been reported (Hegde et al. 2005). It was targeted for study because it was identified as a homolog (67% identical) to a newly discovered, 193-residue protein in Mycobacterium smegmatis that was shown to be responsible for fluoroquinolone resistance in this fast-growing Mycobacterium (Montero et al. 2001). M. tuberculosis MfpA contains 30 consecutive pentapeptide repeats, and the crystal structure revealed that they formed a novel type of right-handed quadrilateral β-helix with each pentapeptide repeat occupying one face of a nearly square repeating unit (Hegde et al. 2005). The tower-like motifs are aligned head to head in the MfpA dimer, and exhibit characteristics similar to B-form DNA, including size, shape, and predominately electronegative surface potential distribution. Indeed, the MfpA structure can be docked readily onto the crystal structure of an N-terminal construct of E. coli DNA gyrase A subunit (Morais Cabral et al. 1997), a protein with a large electropositive potential at the position where DNA is believed to bind, and act as a DNA mimic. These structural data, which suggest a potential interaction between the MfpA dimer and DNA gyrase, were supported by biochemical data showing that MfpA inhibits the supercoiling and relaxing activity of E. coli DNA gyrase (Hegde et al. 2005).
There are at least two other examples of bacterial plasmids encoding proteins with pentapeptide repeats that offer antibiotic resistance. The E. coli McbG protein is responsible for resistance to the peptide antibiotic microcin B17 (Garrido et al. 1988). Like fluoroquinolones, microcin B17 generates DNA double-strand breaks through its interactions with DNA gyrase (Vizan et al. 1991), although details of the biochemical mechanism differ from fluoroquinolones, with microcin B17 trapping a transient intermediate in the C-terminal domain of GyrB (Pierrat and Maxwell 2005). Another example is the oxetanocin A resistance factor discovered in the oxrA resistance locus in a plasmid from Bacillus megaterium (Morita et al. 1999). Oxetanocin A and its derivatives are potent inhibitors of viral DNA polymerases and HIV reverse transcriptase (Izuta et al. 1992). Given that McbG and OxrA contain 13 and nine tandem pentapeptide repeats, respectively, it has been suggested that this may be a sufficient number of consecutive repeats to provide resistance in a mechanism similar to that proposed for MfpA and fluoroquinolones, by acting as a DNA mimic for the antibiotic's target enzyme (Vetting et al. 2006).
It is clear that one biochemical function of PRPs expressed from bacterial plasmids is to provide resistance to fluoroquinolones and other antibiotics. Compelling evidence suggests that the mechanism of resistance is via DNA mimicry (Hegde et al. 2005; Vetting et al. 2006). The origins of antibiotic resistance genes on these plasmids are likely PRP genes present in chromosomal DNA of other organisms that have functions removed from antibiotic resistance. However, little is known about the biochemical function of chromosomal PRP genes. In order to gain a better understanding of the molecular function of PRPs, we have undertaken an effort to characterize the three-dimensional structure of proteins in this family from Cyanothece 51142, a diurnal cyanobacterium with 35 chromosomal PRP genes. Here, we discuss the general features of the amino acid sequences of these 35 PRPs, present the crystal structure for Rfr32, a 167-residue protein with 21 tandem pentapeptide repeats, and describe in detail the two types of turn motifs adopted by each pentapeptide repeat.
| Results and Discussion |
|---|
|
|
|---|
15 mg/L medium). Consequently, the construct containing only Rfr32 residues V30–Q167 was used for our studies because this version was expressed in higher yields in a soluble form and it is likely the active cellular form of the protein. Crystals were grown for truncated Rfr32 (V30–Q167) with and without a 43-residue, N-terminal tag containing an enterokinase cleavage site. Note that the amino acid sequence of the construct used for crystallization has been renumbered such that V30 in the native Rfr32 sequence corresponds to V2 in the crystal structure discussed here and the structures deposited in the Protein Data Bank. Trigonal and tetragonal crystal forms of Rfr32 grew within a week using tagged Rfr32, whereas untagged Rfr32 crystallized only in the tetragonal form. X-ray diffraction data were collected on tagged Rfr32 (Se-Met labeled and unlabeled) in the trigonal crystal form and untagged Rfr32 (unlabeled) in the tetragonal crystal form. As listed in Table 1, both crystals were of comparable quality, with the unlabeled crystals of tagged Rfr32 diffracting to 2.1 Å resolution (PDB ID 2F3L) and Se-Met labeled crystals of untagged Rfr32 diffracting to 2.3 Å resolution (PDB ID 2G0Y). While tagged Rfr32 in the trigonal crystal form contained an N-terminal, 43-residue tag, no electron density was observed for this region. Instead, the first residue with reliable electron density was A7 (A35 in native, full-length Rfr32). SDS-PAGE analysis of protein stock solutions used for crystallization and crystals from unharvested crystallization drops suggests that the protein had not undergone proteolytic degradation before crystallization, indicating that the N-terminal tag was present but disordered in the crystals. Untagged Rfr32 only crystallized in the tetragonal crystal form, and the first residue with reliable electron density was S6 (S34 in native, full-length Rfr32).
Evidently, the 43-residue tag on Rfr32 in the trigonal crystal form had little effect on the crystal structure, and this conclusion is corroborated by closer examination of the structures determined for both tagged and untagged Rfr32. The two crystal forms of Rfr32 contained only one molecule per asymmetric unit, and the structures of both molecules were essentially identical, with a backbone RMSD of 0.32 Å and an all heavy atom RMSD of 0.76 Å. Because the structure obtained for tagged Rfr32 in the trigonal crystal form diffracted to slightly higher resolution (PDB ID 2F3L), this structure is discussed in detail throughout the manuscript.
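The RMSD comparison quoted above can be illustrated with a minimal sketch. It assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z).

    Assumes the structures are already superimposed and atoms are paired
    by index; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(total / len(coords_a))

# Toy example: every atom displaced by 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 1.0, 0.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd(reference, shifted))
```

In practice the superposition step (for example a least-squares fit) precedes this calculation; restricting the atom list to backbone atoms or to all heavy atoms gives the two kinds of values reported in the text.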
|
Overall topology of the three-dimensional structure of Rfr32
Figure 1A is a Molscript representation of Rfr32 (PDB ID 2F3L) using the Kabsch/Sander algorithm to identify regions of secondary structure. The N terminus contains 21 consecutive pentapeptide repeats that form a right-handed parallel β-helix. While such helical β-sheet structures have been observed previously (Jenkins and Pickersgill 2001), PRPs are a unique subset of this group because four consecutive pentapeptide repeats form a nearly "square," quadrilateral unit called a coil (Yoder et al. 1993; Jenkins and Pickersgill 2001; Vetting et al. 2006). These coils stack on top of one another to form a right-handed quadrilateral β-helix, or Rfr-fold. Consequently, the structure has four faces (Face 1 through Face 4) where each pentapeptide repeat on a single coil occupies one face of the tower. Rfr32 contains five complete, uninterrupted, stacked coils (labeled C1 through C5) giving the Rfr-fold a dimension of ~19 Å in height and ~13.4 x 12.7 x 12.4 x 12.5 Å in widths (average of Faces 1 through 4 as measured between backbone Cα carbons of the first and fifth residue of each pentapeptide repeat). The helix completes a revolution every 20 residues and travels ~4.8 Å along the helix axis, a distance similar to the separation between regular parallel β-strands. There is a very slight left-handed twist to the right-handed β-helix going from the N to C terminus, as can be seen in Figure 1B, a ribbon view of the tower's backbone from the top toward the N terminus. The coils of the tower are held together by short stretches of parallel β-sheets (Face 1) and β-bridges (Faces 2–4), both discussed in more detail below, which are integral to the quadrilateral shape of the Rfr-fold. At the C terminus of the Rfr-fold are two antiparallel α-helices (V111–C118 and T132–S135). While the first α-helix projects upwards from the fifth coil (C5) at an ~60° angle, the second shorter α-helix rests on top of Face 4 of C5, and this is more clearly seen in Figure 1B. Hydrogen bonds between the backbone atoms of G101 and F104 with the side chain groups of T132 and S135, respectively, help stabilize α2 on top of Face 4 of C5. An 11-residue loop between the two α-helices folds over the side of Face 4 where it is stabilized by three hydrogen bonds between N81, G100, and T103 with N125, N125, and T128, respectively. The two α-helices are linked by a disulfide bond between C118 and the penultimate residue in the protein, C138. An electron density map supporting the assignment of the disulfide bond is shown in Figure 1C. Interestingly, this disulfide bond is observed despite growing the crystals in the presence of 1.0 mM dithiothreitol, suggesting that it is protected from reduction.
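The coil geometry described above lends itself to a quick arithmetic check. The sketch below uses only the numbers stated in the text (five residues per repeat, four repeats per coil, a rise of about 4.8 Å per coil, five stacked coils); the assumption that the tower height is measured from the first to the last coil, that is, over four coil-to-coil spacings, is ours.

```python
# Values taken from the text.
RESIDUES_PER_REPEAT = 5
REPEATS_PER_COIL = 4
RISE_PER_COIL = 4.8      # Å travelled along the helix axis per revolution
N_COILS = 5              # complete stacked coils C1..C5

residues_per_coil = RESIDUES_PER_REPEAT * REPEATS_PER_COIL
rise_per_residue = RISE_PER_COIL / residues_per_coil

# Assumption: height spans first to last coil, i.e. (N_COILS - 1) spacings.
stack_height = (N_COILS - 1) * RISE_PER_COIL

print(residues_per_coil)           # residues per helical revolution
print(round(rise_per_residue, 2))  # Å per residue along the axis
print(round(stack_height, 1))      # Å, compare with the height quoted above
```

Under that assumption the stack height comes out close to the 19 Å height given for the Rfr-fold, which suggests the quoted rise and height are mutually consistent.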
|
χ1 and χ2 torsion angles (χ1 = 65 ± 6°, χ2 = 85 ± 5°). While the side chain of the ith residue is predominately a large hydrophobic group, the side chain of the i − 2 residue is predominately a small hydrophobic group (13/21 are Ala). As illustrated in Figure 3B, these i − 2 residues also are aligned in columns. The net result is that the interior of the Rfr-fold is predominantly hydrophobic, alternating between columns of large and small side chains, and devoid of water. Indeed, the crystal structure revealed no water molecules in the interior.
|
|
Detailed description of the Rfr-fold
The general structural features of the Rfr-fold in Rfr32 are similar to those observed in the only other solved structure of a protein with an Rfr-fold, MfpA (Hegde et al. 2005; Vetting et al. 2006). Such similarities were predicted by Vetting et al. (2006) because the repeating nature of the pentapeptide sequence in PRPs suggests that they should also adopt similar repeating conformations throughout the structure. However, with a second structure of an Rfr-fold to analyze, it is possible to characterize in more detail the structural properties that may be universal to all Rfr-folds.
Figure 4 is a plot of the main chain (φ, ψ) dihedral orientations for Rfr32 residues A7–V111 that make up the Rfr-fold. Clearly, two distinct patterns are observed for the five residues constituting each coil on Face 1 and for the five residues constituting each coil on Faces 2, 3, and 4. Figure 5 is a Ramachandran plot of the data in Figure 4 coded based on position in the pentapeptide repeat with residues in Face 1 colored blue and residues in Faces 2, 3, and 4 colored red. The first observation is that only 76% of the residues in the Rfr-fold lie in the most favored region while the remaining 24% lie in the additionally allowed region. This is somewhat surprising given the high quality and resolution of the X-ray diffraction data, and suggests that there is something unique to the Rfr-fold. Closer inspection of the Ramachandran plot reveals that the (φ, ψ) pairs for the i − 2 (circles), i − 1 (squares), and i + 2 (diamonds) residues are clustered into regions of the Ramachandran plot based on their position in the pentapeptide sequence regardless of their Face position in the Rfr-fold. For the ith (x) and i + 1 (+) residues, the (φ, ψ) pairs from Face 1 are clustered into a different region than the pairs from Faces 2, 3, and 4. These four regions, circled red (i) and blue (i + 1) in Figure 5, differ by ~90–110° in ψ and φ, indicating that the two distinct conformations of the pentapeptide repeat differ by an ~90° rotation of the peptide unit between residue i and i + 1.
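For readers who want to reproduce a (φ, ψ) analysis of this kind, the sketch below shows one common way to compute a signed dihedral angle from four atom positions. It is a generic illustration and not the procedure used by the authors; the function name `dihedral_deg` and the test coordinates are hypothetical. For a residue i, φ is the dihedral over C(i − 1), N(i), Cα(i), C(i) and ψ is the dihedral over N(i), Cα(i), C(i), N(i + 1).

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    Positive values correspond to a clockwise rotation of the far bond
    relative to the near bond when looking along p1 -> p2.
    """
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2))
    x = b2_len * _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Toy geometry: central bond along z, outer atoms a quarter turn apart.
print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applying such a function residue by residue yields the pairs that are plotted on a Ramachandran diagram.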
|
|
(φ, ψ) dihedral orientations of the residues in the two pentapeptide conformations is shown in Table 2. The first entries are the general (φ, ψ) dihedral orientations proposed by Vetting et al. (2006) based on their single structure, and the second value is a refinement of the general orientations based on our second structure (for specifics, see Table 3). Two features common in all the pentapeptide repeats are a β-bridge in the i − 1 position of the repeat and, as illustrated in Figure 1A, an ~90°, right-handed, four-residue turn between each pentapeptide repeat. A turn in a protein is defined if the carbonyl of residue i hydrogen bonds with the amide of residue i + n (Kabsch and Sander 1983). The β-turn is a very common four-residue turn (n = 3) between residues that are not in an α-helix with a Cα(i) to Cα(i + 3) distance of <7 Å (Richardson 1981; Shepherd et al. 1999). β-Turns effect a reversal in the direction of the protein backbone and are typically subclassified into nine different types on the basis of the main-chain (φ, ψ) dihedral values (Wilmot and Thornton 1988). In Rfr32 the mean Cα(i) (residue i) to the following Cα(i + 3) (residue i − 2 of the following pentapeptide repeat) distance is always <7.0 Å in all four turns of each coil. Hence, tandem pentapeptide repeats (1) contain no α-helices, (2) contain at least one β-bridge, (3) effect a change in the direction of the protein backbone, and (4) have a Cα(i) to Cα(i + 3) distance <7.0 Å, features characteristic of β-turns (Richardson 1981; Shepherd et al. 1999). Therefore, one way of describing the Rfr-fold is as a collection of two types of secondary structure elements, β-turns, involving residues i, i + 1, i + 2, of one pentapeptide repeat and the first residue of the following pentapeptide repeat, (i − 2), connected by isolated β-bridges involving the i − 1 residue (Vetting et al. 2006). These two β-turns fall into types II and IV because the main chain (φ, ψ) dihedral values of the central two residues, i + 1 and i + 2, in the four-residue turn are within 30° of the "ideal" values for these types (Wilmot and Thornton 1988). Note that while the type II β-turn is common (Hutchinson and Thornton 1994; Pabasik et al. 2005), the type IV β-turn is a miscellaneous bin for β-turns that do not fall into any of the other categories (Richardson 1981) and is actually the most populated type (Hutchinson and Thornton 1994). As illustrated in Figure 5 and highlighted in bold in Table 2, the major difference between the two types of β-turns is the ψ and φ torsion angles of the i and i + 1 residues due to an ~90° rotation of the peptide unit between these two residues. The consequence of this single peptide unit rotation is an altered network of intercoil and intracoil hydrogen bonding as illustrated in two examples for Rfr32 in Figure 6.
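The classification logic described above (compare the dihedral angles of the two central turn residues with tabulated ideal values, and fall back to type IV when nothing matches) can be sketched as follows. This is a simplified illustration, not the authors' procedure: the ideal values are the commonly tabulated ones for types I, I', II, II', and VIII; the cis-proline types VIa and VIb are omitted; a strict 30° tolerance is applied to all four angles; and the function names and example angles are hypothetical.

```python
# Commonly tabulated ideal (phi, psi) values, in degrees, for the two
# central residues of a four-residue beta-turn (types VIa/VIb omitted).
IDEAL_TURNS = {
    "I":    (-60.0, -30.0, -90.0, 0.0),
    "I'":   (60.0, 30.0, 90.0, 0.0),
    "II":   (-60.0, 120.0, 80.0, 0.0),
    "II'":  (60.0, -120.0, -80.0, 0.0),
    "VIII": (-60.0, -30.0, -120.0, 120.0),
}

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def classify_beta_turn(phi1, psi1, phi2, psi2, tol=30.0):
    """Return the turn type whose ideal angles all lie within tol degrees.

    phi1/psi1 belong to the second turn residue, phi2/psi2 to the third.
    Anything that matches no listed type is placed in the miscellaneous
    type IV bin.
    """
    observed = (phi1, psi1, phi2, psi2)
    for name, ideal in IDEAL_TURNS.items():
        if all(angle_diff(o, i) <= tol for o, i in zip(observed, ideal)):
            return name
    return "IV"

# Illustrative angles only (not values from Table 2 or Table 3).
print(classify_beta_turn(-60.0, 130.0, 75.0, 10.0))
print(classify_beta_turn(-120.0, 130.0, -120.0, 130.0))
```

A fuller implementation would also apply the Cα(i) to Cα(i + 3) distance cutoff of 7 Å and exclude helical residues before classifying, as the text describes.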
|
|
|
|
β-turn with the i − 2 residue of the following pentapeptide. Only the i − 1 β-bridge residue contributes both an amide proton and carbonyl oxygen to intercoil hydrogen bonding. The main chain amide of the ith residue and the main chain carbonyl of the i + 1 residue are roughly orthogonal to the plane of the other main chain atoms of the i − 2 through i + 1 residues, and consequently, they cannot form intercoil hydrogen bonds. However, in this orthogonal orientation, the carbonyl of the ith residue is near the amide of the i − 2 residue where it forms an intracoil hydrogen bond as shown in the top view in Figure 6A. Such a hydrogen bond is a characteristic of a DSSP defined turn (Kabsch and Sander 1983). As illustrated by a solid blue arrow in Figure 5, the Ramachandran nomenclature (Wilmot and Thornton 1990) for the type II β-turn in the Rfr-fold is
P
.
Figure 6B illustrates the hydrogen bonding network for three coils (C2, C3, and C4) on Face 1 viewed from the front and the top. In this example, the pentapeptide repeat on each coil forms a type IV β-turn with the i − 2 residue of the following pentapeptide. The approximate 90° rotation of the peptide unit between the ith and i + 1 residue places the main chain amide of the ith residue and the main chain carbonyl of the i + 1 residue into the plane of the other main chain atoms of the i − 2 through i + 1 residues. Now this amide and carbonyl group can fully form intercoil hydrogen bonds with the main chain atoms of the pentapeptide above and below it if these pentapeptide repeats are also in the same type IV β-turn position, as is the case in Figure 6B. However, one consequence of this ~90° rotation of the peptide unit between the i and i + 1 residue is that the main chain carbonyl of the ith residue is no longer near the amide proton of the i − 2 residue, as shown in the top view in Figure 6B, and now these groups cannot form an intracoil hydrogen bond. As a result of the loss of intracoil hydrogen bonds for intercoil hydrogen bonds, the pentapeptide that forms a type IV β-turn is more extended (~0.9 Å) than the pentapeptide that forms a type II β-turn. Furthermore, because of the absence of a hydrogen bond between the carbonyl of residue i with the amide of residue i + n, this is not a "turn" as defined by the DSSP convention (Kabsch and Sander 1983). As illustrated by a blue dashed arrow in Figure 5, the Ramachandran nomenclature (Wilmot and Thornton 1990) for the type IV β-turns in the Rfr-fold is
L, which is different from the
P
observed for the type II
β-turns. This small difference is reflected in the Ramachandran nomenclature arrows for both turns shown in Figure 5; they cross but are not coincident.
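The DSSP convention invoked above decides whether a backbone C=O and N–H pair is hydrogen bonded from a simple electrostatic energy. The sketch below implements that published energy expression (partial charges of 0.42e and 0.20e, a factor of 332 Å·kcal/mol, and a cutoff of −0.5 kcal/mol); the function names and the toy coordinates are hypothetical and are not taken from the Rfr32 structure.

```python
import math

Q1 = 0.42              # partial charge on C and O (units of e)
Q2 = 0.20              # partial charge on N and H (units of e)
F = 332.0              # dimensional factor, Å·kcal/mol
HBOND_CUTOFF = -0.5    # kcal/mol; more negative energies count as a bond

def dssp_hbond_energy(c, o, n, h):
    """Kabsch-Sander electrostatic energy between a C=O and an N-H group."""
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn)

def is_hbond(c, o, n, h):
    return dssp_hbond_energy(c, o, n, h) < HBOND_CUTOFF

# Toy collinear geometry: C=O pointing at H-N with an O...H distance of 2.0 Å.
c, o = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
h, n = (3.23, 0.0, 0.0), (4.23, 0.0, 0.0)
print(round(dssp_hbond_energy(c, o, n, h), 2), is_hbond(c, o, n, h))
```

In DSSP, an n-turn at residue i is then assigned when the C=O of residue i and the N–H of residue i + n satisfy this criterion, which is why a repeat lacking that intracoil bond is not reported as a turn.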
One consequence of such a collection of turns in an Rfr-fold is that a type IV β-turn may not exist in isolation on the face of an Rfr-fold (e.g., a type IV β-turn enveloped by type II β-turns above and below it) because there will be no new intercoil hydrogen bond between two stacked i + 1 residues (Fig. 6B, front) to compensate for the intracoil hydrogen bond between sequential i and i − 2 residues (Fig. 6A, top) that is lost in a type IV β-turn. In the only two solved PRP structures containing an Rfr-fold, type IV β-turns in isolation are not observed (see Fig. 8 below). However, the situation is not that simple because the side chain of the i − 2 residue is oriented inside the core of the Rfr-fold (Fig. 3A). In type IV β-turns these residues are often serine and threonine (Hegde et al. 2005). The hydroxyl groups of these side chains are observed to hydrogen bond with their own backbone amide or the backbone carbonyl group of the ith residue on the coil directly below it (Hegde et al. 2005), acting a bit like a backbone mimic (Eswar and Ramakrishnan 1999) and providing some added stabilization to the turn. Note that if type IV β-turns must, at a minimum, be present in pairs on a face of an Rfr-fold, then they would also always meet one of the major requirements to be called a β-bulge. Bulges are believed to play an important biological role in proteins, affecting the direction of the β-sheet and the positioning of important residues for function (Chan et al. 1993). While the classical definition of a β-bulge is two residues in the bulged β-strand opposite one residue in the adjacent β-strand (Richardson et al. 1978), a β-bulge can be any irregularity in a β-sheet involving two strands (Chan et al. 1993), including irregularities directly opposite each other. In the latter example, called P-bent, the displaced residues on both strands occupy the αR region of Ramachandran space and bend the parallel β-sheet ~45° (Chan et al. 1993). In Rfr32, the displaced residues, i + 2 in the pentapeptide repeat, occupy the αL region of Ramachandran space and bend the β-sheet ~90°. β-Bulges in parallel sheets are very uncommon, with the majority, ~90%, observed in antiparallel β-sheets (Richardson et al. 1978; Chan et al. 1993).
|
β-bridge at residue i − 1. Only when two type IV β-turns exist on the same face of an Rfr-fold are there two adjacent β-bridges present to form a DSSP defined β-ladder, that is, at the same time, a DSSP defined β-sheet (Kabsch and Sander 1983). As illustrated in Figure 1A, all the type IV β-turns occur on Face 1 of Rfr32 to form one continuous parallel β-sheet. On the three remaining faces of Rfr32 parallel β-bridges are aligned to form a long, single-bridge β-sheet on each face. While isolated β-bridges are occasionally observed in protein structures (Richardson and Richardson 2002), to the best of our knowledge the stacked parallel β-bridges observed in the faces of the Rfr-fold are unique in protein fold "space."
Comparison with MfpA
Figures 7 and 8 are side-by-side comparisons of the crystal structures of Rfr32 (2F3L) and the only other solved structure of a protein with an Rfr-fold, MfpA (2BM5) (Hegde et al. 2005), using the structures refined to the highest resolution (2.1 and 2.0 Å, respectively). To simplify the figures, only one MfpA molecule in the C-terminal, head-to-head dimer is shown (monomer A). As illustrated in Figures 7 and 8, MfpA and Rfr32 share a similar overall architecture: an N-terminal, right-handed quadrilateral β-helix (Rfr-fold) capped with a pair of α-helices. The pentapeptide repeats adopt a similar registration in both molecules, with four tandem repeats defining a coil of the tower and each coil rising ~4.8 Å along the helix axis. Indeed, a Dali search (Holm and Sander 1998) using residues A7–L106 of Rfr32 returns a Z-score of 19.2 and an RMSD of 1.0 Å with residues Q2–G101 of MfpA, indicating that the Rfr-fold is very similar in both molecules. MfpA also contains two α-helices toward the C terminus, with one helix incorporated into the last coil on Face 3 and the second one sitting over the top of the last pentapeptide repeat on Face 2.
The pentapeptide repeats in the Rfr-fold of Rfr32 and MfpA all adopt one of two general conformations, a type II or type IV β-turn. This is evident from the analysis of the data in Table 3 that lists the mean main chain φ and ψ torsion angles for the two types of turns in Rfr32 and MfpA. The means for 17 out of 20 of the listed torsion angles are <5° apart. Of the three listed torsion angles that differ by >5°, the means still fall within the standard deviations. In Figure 7 the type II and type IV β-turns adopted by the pentapeptide repeats are colored blue and cyan, respectively, for MfpA and Rfr32. The face on the left-hand side of the cyan colored type IV β-turn also contains a DSSP defined β-sheet while the face on the left-hand side of the blue colored β-turn contains a single-bridge β-sheet. As mentioned earlier, the type IV β-turns always appear, at a minimum, in pairs on a face. One obvious difference between Rfr32 and MfpA is that the two types of turns are clustered on individual faces of Rfr32 while they are mixed in the N-terminal half of MfpA. Perhaps the different arrangement of the turns is a mechanism for generating different surfaces on the faces of an Rfr-fold.
The Rfr-fold of MfpA has eight consecutive, complete helical turns with a prominent kink after the fourth helical turn that was attributed to a cis-proline between C4 and C5 on Face 4 (the turn before and after this residue is colored yellow in Fig. 7). The kink induces a 12° change in the helical axis of coils C1–C4 and C5–C8, which may be essential to its function as a DNA mimic, causing a sigmoidal shape across the head-to-head dimer (Hegde et al. 2005; Vetting et al. 2006). On the other hand, the Rfr-fold of Rfr32 contains no proline residues and no such kink, and therefore, Rfr32 may be unable to function as a DNA mimic.
Another small difference between the two structures is that the Rfr-fold in MfpA is more twisted than in Rfr32 with the twist most prominent before the kink (C1–C4). Vetting et al. (2006) suggested that the twist may be driven by stacking interactions of the interior aromatic side chains that minimize the negative interaction between the π-electron clouds (Hunter et al. 1991). While such stacking interactions may contribute to the twist in the Rfr-fold, they likely are not the major contributor to the twist. Figure 8 highlights the position of the aromatic residues in the structures of Rfr32 and MfpA. The latter Rfr-fold contains two stacks, one of three (Face 2) and one of four (Face 3) aromatic residues, while Rfr32 contains three stacks, one of three (Face 1) and two of two (Faces 2 and 3) aromatic residues. Granted, the more extended aromatic stacks in MfpA may generate more twist, but Rfr32 and the lower coils of MfpA both contain the same overall number of stacked aromatic residues, seven. A more likely reason for the twist is the difference in overall length of the pentapeptide repeat in a type IV β-turn versus a type II β-turn, ~0.9 Å. In Figure 7 the two types of turns are colored cyan (type IV) and blue (type II), and clearly, there is a mixing of the turns in the lower coils of MfpA while in Rfr32 all the type IV β-turns are on Face 1. Consequently, the lower coils of MfpA will be more twisted than the coils of Rfr32.
Related to the difference in the length of the two β-turns and aromatic side chain stacking, Vetting et al. (2006) observed that phenylalanine residues predominated the i position of pentapeptide repeats adopting type IV β-turns. They suggested that the more extended type IV β-turn conformation better accommodated the bulky aromatic group than the type II β-turn conformation, as seven out of 10 aromatic residues were observed in type IV β-turns in the Rfr-fold of MfpA. However, the correlation may have been serendipitous, as only three out of the eight aromatic residues in the Rfr-fold of Rfr32 are observed in type IV β-turns.
The second α-helix at the C terminus of MfpA interacts in an antiparallel fashion with the identical helix in a second molecule to form an intermolecular head-to-head dimer. Similar head-to-head dimers were observed in the four crystal forms that were characterized, and dimers were also observed to form in solution (Hegde et al. 2005; Vetting et al. 2006). In contrast, Rfr32 is a monomer in solution, as shown by size-exclusion chromatography and nuclear magnetic resonance spectroscopy (data not shown), and the packing in the two crystals of Rfr32 differed from MfpA such that the C-terminal α-helices of two molecules do not contact each other (data not shown). Therefore, if dimer formation is essential for MfpA to function as a DNA mimic, the absence of a similar dimer in Rfr32 may hint toward a distinct function in the luminal space of cyanobacteria.
Circular dichroism profile and thermal stability of cleaved Rfr32
Circular dichroism spectroscopy is a powerful tool to probe the conformation of proteins in solution (Woody 1974; Smith and Pease 1980) because small changes in the backbone conformation can cause strong changes in the CD spectrum (Manning et al. 1988). While the CD spectra of α-helices and β-sheets are well characterized, there remains some ambiguity regarding the pure-component CD spectra of different types of β-turns (Perczel et al. 1992). Understanding the component contribution of β-turns to the CD spectrum is important because β-turns are a common structural motif, comprising up to 25% of the structure of all folded proteins and peptides (Kabsch and Sander 1983; Wilmot and Thornton 1988). One of the reasons for the ambiguity in the contribution of β-turns to CD spectra is the scarcity of proteins and model compounds that are purely one type of β-turn. The crystal structure of Rfr32 shows that ~75% of the protein forms a right-handed quadrilateral β-helix structure with 80% of the residues participating in two types of β-turns (75% type II and 25% type IV) with one face of the fold adopting a canonical parallel β-sheet structure and three faces of the fold forming single-bridge β-sheets (Fig. 1A). Consequently, the CD spectrum of Rfr32 should be dominated by β-turn and parallel β-sheet features.
A far-UV CD spectrum of Rfr32, obtained using a sample with the N-terminal 43 residues removed so that the spectrum was free of contributions from this section, is shown in Figure 9A. The spectrum is dominated by one feature, a minimum band at ~216 nm, with no distinct maximum band. As might be expected, this spectrum most closely resembles the pure-component CD spectra for β-turns and parallel β-sheets (Perczel et al. 1992). Although Rfr32 has two short α-helices at the C terminus, the characteristic α-helical double minimum at 222 nm and 208–210 nm and the maximum between 190 nm and 195 nm (Holzwarth and Doty 1965) are buried under the major band. Once the correlation between pentapeptide sequence and β-turn type is better understood, it may be possible, by genetically modifying Rfr32 to remove the C-terminal α-helices and convert the type IV β-turns into type II β-turns, to construct an Rfr-fold that is composed entirely of type II β-turns and obtain a CD spectrum even more dominated by this component. Alternatively, some of the other Cyanothece PRPs may natively be free of extraneous secondary structure and contain an Rfr-fold dominated by a single type of turn.
|
~40°C, upon which a step decrease occurs up to 55°C, at which point a plateau is reached. The inflection point for the transition, which is nonreversible, is ~48°C and likely reflects the unraveling of the Rfr-fold.
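As an illustration of how the inflection point of a thermal transition such as this one might be located, the following minimal sketch takes the temperature of steepest change in a melt curve as the midpoint. The data are synthetic, generated from a hypothetical sigmoid centered at 48°C in arbitrary units; they are not the measured Rfr32 ellipticities, and the function name `melt_midpoint` is an assumption made for this example.

```python
import math


def melt_midpoint(temps, signal):
    """Return the interior temperature at which the signal changes fastest.

    The slope at each interior point is estimated with a central finite
    difference; the point with the largest absolute slope is returned.
    """
    best_temp, best_slope = None, 0.0
    for i in range(1, len(temps) - 1):
        slope = (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        if abs(slope) > abs(best_slope):
            best_temp, best_slope = temps[i], slope
    return best_temp


# Synthetic melt curve (hypothetical values, arbitrary units): a plateau,
# a step decrease centered at 48 degrees C, then a second plateau.
temps = list(range(20, 81, 2))
signal = [10.0 - 6.0 / (1.0 + math.exp(-(t - 48.0) / 1.5)) for t in temps]

midpoint = melt_midpoint(temps, signal)
print(midpoint)
```

For noisy experimental data, fitting a sigmoid to the curve would likely be more robust than a raw finite difference, which amplifies noise.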
Analysis of the PRP family in Cyanothece 51142
Table 4 lists the 35 PRPs identified in the genome of Cyanothece 51142. SOSUIsignal analysis (Gomi et al. 2004) of these PRP sequences predicts that seven will be located in the lumen/periplasm, nine in the plasma membrane, and the rest in the cytosol. While the proteins vary in size from 105 to 930 residues, 80% of them contain <400 amino acids. Analysis of their sequences predicts that they contain as few as 14 (Rfr33) and as many as 61 (Rfr01) pentapeptide repeats. Except for Rfr08, Rfr02, Rfr17, and Rfr16, all the pentapeptide repeats are tandem. Figure 10 graphically illustrates the predicted composition of the 35 Cyanothece 51142 PRPs in terms of pentapeptide repeat (red), N-terminal (dark blue), C-terminal (light blue), and other regions (white). For proteins with <400 residues, a predicted Rfr-domain constitutes >50% of the residues in all but three PRPs. For proteins with >400 residues, the predicted Rfr-domain constitutes less than one-third of the protein, and in two of these larger proteins the pentapeptide repeats are not all tandem. As observed in the other cyanobacteria, the predicted Rfr-domain is located toward the C terminus in the majority of the Cyanothece 51142 PRPs, especially in proteins that likely contain multiple domains (Vetting et al. 2006).
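The percentages quoted above follow from simple arithmetic if each pentapeptide repeat is counted as five residues. The sketch below shows that calculation under this assumption; the two entries are hypothetical placeholders rather than values from Table 4, and the function names are introduced only for illustration.

```python
REPEAT_LENGTH = 5  # residues per pentapeptide repeat


def repeat_fraction(n_repeats, protein_length):
    """Fraction of residues that fall inside predicted pentapeptide repeats."""
    if protein_length <= 0:
        raise ValueError("protein_length must be positive")
    return REPEAT_LENGTH * n_repeats / protein_length


def is_repeat_dominated(n_repeats, protein_length, threshold=0.5):
    """True when the predicted repeats cover more than `threshold` of the protein."""
    return repeat_fraction(n_repeats, protein_length) > threshold


# Hypothetical entries (name, predicted repeats, length); NOT taken from Table 4.
examples = [("small_prp", 20, 160), ("large_prp", 30, 600)]
summary = {name: round(repeat_fraction(n, length), 3) for name, n, length in examples}
print(summary)
```

Here the hypothetical 160-residue protein is repeat-dominated (62.5%), whereas the hypothetical 600-residue protein is not (25%).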
|
|
~40% had an additional, non-Rfr, domain. The additional domains could be grouped into 20 categories, with only seven containing more than two members. The three most populated domains were the WD40 β-transducin repeat (56), Ser/Thr protein kinase (11), and tetratricopeptide_1 repeat (11) (Vetting et al. 2006). Table 4 indicates that 15 of the 35 Cyanothece PRPs are >50% nonpentapeptide repeat and, consequently, could also contain an additional domain. A BLAST study (Altschul et al. 1990) of the 35 Cyanothece PRPs indicates that only four contain an additional, identifiable domain, as listed in Table 4. Two are Ser/Thr kinases, one is a DnaJ domain, and the fourth is a UvrD/REP helicase. The latter domain catalyzes ATP-dependent unwinding of double-stranded DNA to single-stranded DNA, and was the only domain not identified in the study by Vetting et al. (2006). Given that very little is known about the biological function of PRPs, the diverse protein sequences attached to some of the other Rfr-folds may have novel, uncharacterized folds and functions. For example, only 14% of Rfr33 is composed of pentapeptide repeats, and this protein is homologous to RfrA, a protein with an Rfr-domain that has been associated with manganese uptake in Synechocystis (Chandler et al. 2003).
In addition to perhaps performing a unique biological function, the protein sequences that often straddle an Rfr-fold may also serve as stabilizers of the Rfr-fold. The top and bottom of the "naked" Rfr-fold contain exposed hydrogen-bond donors and acceptors in position to form "edge-to-edge" β-bridges and β-sheets with another molecule. If the Rfr-fold really were naked, these ends could lead to edge-to-edge aggregation of the protein (Richardson and Richardson 2002). However, as shown in Figure 1B, the C terminus of the Rfr-fold is capped by a pair of α-helices connected by a loop that sits on top of the edge of the last coil, protecting three of the four faces on the C-terminal edge of the Rfr-fold from forming edge-to-edge aggregates. At the N terminus, the Rfr32 construct used for crystallization contains a large ~5-kDa polypeptide tag that could prevent the N terminus from forming edge-to-edge aggregates. Untagged Rfr32 contained seven residues prior to the first pentapeptide repeat that could perform the same function, and interestingly, these molecules packed N-terminal-to-N-terminal in the crystal. Without this tag, native Rfr32 contains a 29-residue signal sequence that may perform a similar function at the N terminus before the protein reaches its destination in the thylakoid lumen. Richardson and Richardson (2002) observed that β-sheet edge protection was common in the structures of other β-helical proteins. Perhaps the C-terminal α-helices in Rfr32 have no biochemical function except to prevent the C terminus of the Rfr-fold from aggregating. MfpA also contains a pair of α-helices at the C terminus that may perform a similar role, especially when they associate with another MfpA molecule to form a head-to-head dimer. Note that 10 of the 35 PRPs identified in Cyanothece have few, if any, predicted non-Rfr-fold residues at the C terminus (Fig. 10). It will be interesting to see if these PRPs without a C-terminal sequence are monomers, dimers, or higher order aggregates in solution.
Conclusions
The Rfr-fold is a special subset in the right-handed parallel β-helix family of protein structures, with at least 16 right-handed parallel β-helices having been observed and listed in the SCOP structure database (Murzin et al. 1995). Like the Rfr-fold, these β-helices also have coils with a spacing of ~4.8 Å (Jenkins and Pickersgill 2001). However, unlike the coils in the Rfr-fold, where the sequence length of each coil is 20 residues, the sequence lengths of the coils in the other β-helices range from 12 (Liou et al. 2000) to 30+ (Badger et al. 2005) residues. Furthermore, even within the same β-helix, the length of the coil may vary, in contrast to the consistent 20-residue length observed in each coil in the Rfr-fold. At least three different types of stacking occur in the interior of β-helices: aliphatic stacks, aromatic stacks, and polar stacks (Jenkins and Pickersgill 2001). Aromatic and aliphatic stacks are observed at the ith residue position of the pentapeptide repeat in the Rfr-fold, while aliphatic stacks are observed in the i − 2 position. A major difference between the Rfr-fold and most of the other right-handed parallel β-helices is the number and length of the "faces" in the β-helix. The Rfr-fold contains four faces that vary by <1 Å in length, while the other right-handed parallel β-helices have three or four faces with cross-sections that are triangular (Graether et al. 2000), square (Liou et al. 2000), rectangular (Badger et al. 2005), or even L-shaped (Emsley et al. 1996). While all right-handed parallel β-helices, including the Rfr-fold, contain parallel β-sheets, only the Rfr-fold contains linear stacked arrays of β-bridges aligned to form single-residue β-sheets. Interestingly, parallel β-sheets are more rigid than antiparallel β-sheets (Emberly et al. 2004), suggesting that the architecture of the Rfr-fold and all right-handed parallel β-helices may be especially sturdy, although the relatively low melting temperature determined for Rfr32 by CD spectroscopy contradicts this hypothesis. Amidst the diversity of shapes adopted by right-handed parallel β-helices, the Rfr-fold appears to be the only group of right-handed β-helices that may be readily predicted from the amino acid sequence. Analysis of the two available crystal structures of proteins containing pentapeptide repeats suggests that the structure of this sequence-identifiable right-handed β-helix, the Rfr-fold, is shaped by individual pentapeptide repeats adopting one of two turn motifs. These two turn motifs may be universal to the Rfr-fold in all 2110 PRPs in the Pfam database.
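Because each coil of the Rfr-fold spans a consistent 20 residues (four pentapeptide repeats) with a coil spacing of ~4.8 Å, the approximate size of an Rfr-fold can be estimated from the number of tandem repeats alone. The following is a rough sketch of that arithmetic; it ignores helical twist, capping elements, and incomplete coils, and the function names and the 20-repeat example are assumptions made for illustration.

```python
RESIDUES_PER_REPEAT = 5
RESIDUES_PER_COIL = 20        # consistent coil length described for the Rfr-fold
COIL_SPACING_ANGSTROM = 4.8   # approximate spacing between stacked coils


def coil_count(n_repeats):
    """Number of coils formed by n tandem pentapeptide repeats (may be fractional)."""
    return n_repeats * RESIDUES_PER_REPEAT / RESIDUES_PER_COIL


def approximate_fold_length(n_repeats):
    """Rough axial length of the fold in angstroms: coils times coil spacing."""
    return coil_count(n_repeats) * COIL_SPACING_ANGSTROM


# Hypothetical protein with 20 tandem repeats: 5 coils, roughly 24 angstroms.
print(coil_count(20), approximate_fold_length(20))
```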
While the small family of right-handed parallel β-helix structures share many common features, there is also a lot of variation in the size and shape of these structures (Jenkins and Pickersgill 2001). Likewise, there is also variation in the known functions of these right-handed parallel β-helices, ranging from pectate lyases (Yoder et al. 1993) to hyperactive antifreeze proteins (Graether et al. 2000). On the other hand, because of the repetitive nature of the pentapeptide repeat sequence, it was predicted that the structures adopted by such tandem sequences would all be very similar (Bateman et al. 1998; Hegde et al. 2005). The second structure of a protein containing an Rfr-fold reported here, Rfr32, supports these predictions as there are striking similarities in overall architecture of the Rfr-fold in Rfr32 and MfpA. However, there are also differences, likely related to the sequential ordering of type II and type IV β-turns, which result in different twists to the β-helix and different surfaces exposed to the solvent. Differences on the solvent-exposed surface may also be introduced with "exceptions" to the general pentapeptide repeat motif. The Pfam definition of the pentapeptide repeat is A[D,N]LXX (Bateman et al. 2000); however, a more precise definition of the consensus sequence by Vetting et al. (2006) is [S,T,A,V][D,N][L,F][S,T,R][G]. The latter definition is not exclusive, since the Rfr-fold can tolerate exceptions, especially with regard to solvent-exposed residues in the i − 1, i + 1, and i + 2 positions in the pentapeptide repeat (see Fig. 2). Overall, such differences between Rfr-folds may result in variations to the biochemical functions of the Rfr-fold. Indeed, these differences are pronounced enough to suggest that Rfr32 may have a function different from the one proposed for MfpA.
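The two consensus definitions can be expressed as regular expressions, which makes the difference in stringency concrete. The sketch below is illustrative only: it translates A[D,N]LXX and [S,T,A,V][D,N][L,F][S,T,R][G] into Python `re` patterns (character classes are written without commas, and X becomes `.`) and reports the longest run of back-to-back matches. The test sequences are hypothetical, the helper name is an assumption, and strict matching of this kind does not tolerate the exceptions discussed above, so it is not the prediction method used in this work.

```python
import re

# Pfam definition A[D,N]LXX, where X is any residue.
PFAM_REPEAT = r"A[DN]L.."
# Consensus of Vetting et al. (2006): [S,T,A,V][D,N][L,F][S,T,R][G].
VETTING_REPEAT = r"[STAV][DN][LF][STR]G"


def longest_tandem_run(sequence, repeat_pattern):
    """Return the largest number of consecutive five-residue matches."""
    tandem = re.compile("(?:%s)+" % repeat_pattern)
    longest = 0
    for match in tandem.finditer(sequence):
        # Every repeat is exactly five residues long.
        longest = max(longest, len(match.group(0)) // 5)
    return longest


# Hypothetical sequences, used only to exercise the patterns.
strict_seq = "MK" + "ADLSG" * 3 + "ANLTG" + "QQ"
loose_seq = "ADLKEADLKE"

print(longest_tandem_run(strict_seq, PFAM_REPEAT))      # 4
print(longest_tandem_run(strict_seq, VETTING_REPEAT))   # 4
print(longest_tandem_run(loose_seq, PFAM_REPEAT))       # 2
print(longest_tandem_run(loose_seq, VETTING_REPEAT))    # 0
```

The second hypothetical sequence satisfies the Pfam definition but not the stricter consensus, because its i + 1 and i + 2 residues (K and E) fall outside [S,T,R] and G.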
There is convincing evidence that one biochemical function of PRPs expressed from bacterial plasmids is to provide resistance to fluoroquinolones and other antibiotics via a mechanism that involves DNA mimicry (Hegde et al. 2005; Vetting et al. 2006). These plasmid genes that convey antibiotic resistance likely developed from PRP genes present in chromosomal DNA as a special niche that originated secondarily to its primary role or function in cyanobacteria. However, little is known about the biochemical function of the pentapeptide repeat domain in these chromosomal PRP gene products, and nothing is known about their mechanism of action. Cyanobacteria are unique in the sheer number of proteins with predicted Rfr-domains. As observed in other cyanobacteria (Kieslebach et al. 1998), the 35 PRPs in Cyanothece 51142 are predicted to be located in all of the cellular compartments (Table 4), and only one of these compartments, the cytosol, contains DNA. These observations, sheer numbers, and disparate cellular locations argue for an important physiological function for PRPs (Kieslebach et al. 1998) in cyanobacteria that likely does not involve DNA mimicry. Further studies of the structure and biochemical function of proteins containing Rfr-folds are necessary in order to refine our emerging understanding of this intriguing family of proteins.
| Material and methods |
|---|
|
|
|---|
The acronym PRP is used throughout the manuscript to describe proteins containing pentapeptide repeats. However, the PPR acronym for the pentapeptide repeat motif conflicts with PPR nomenclature already used to define pentatricopeptide repeat motifs in a large family of proteins in Arabidopsis thaliana (Small and Peeters 2000). Consequently, to annotate the PRP genes from Cyanothece 51142, we are adopting the nomenclature first used by Chandler et al. (2003) to annotate the PRP genes from Synechocystis 6803: repeated five-residues (Rfr). As a result, the 35 PRP genes from Cyanothece 51142 are annotated Rfr1 through Rfr35, based on their sequential position in the chromosome.
Cloning, expression, and purification
The Rfr32 gene minus the N-terminal 29 residues containing the signal peptide was amplified using the genomic DNA of Cyanothece sp. ATCC 51142 and the oligonucleotide primers 5'-ATCGAGGTCTCACATGGTCACTGGCTCCAGTGC-3' and 5'-TGACTGGTCTCCGAGCTATTGACATCGTAAGGACTCACGG-3' (Midland). The amplified Rfr32 gene, corresponding to Rfr32 residues V30–Q167, was inserted into the NcoI/XhoI-digested expression vector pET30b (Novagen) such that a 43-residue tag containing six consecutive histidine residues was added to the N terminus of the gene product. The recombinant plasmid was transformed into E. coli BL21 (DE3) and methionine-auxotrophic B834 (DE3) cells (Novagen). The SeMet-substituted protein was expressed in the B834 (DE3) cells following an autoinduction protocol using minimal medium supplemented with 34 µg/mL kanamycin, 30 µg/mL chloramphenicol, and 200 µg/mL of each individual amino acid except for the inclusion of 10 µg/mL methionine and 125 µg/mL selenomethionine. After autoinduction at 25°C, the cells were harvested by centrifugation and frozen at 193 K. Thawed cells were resuspended in 32 mL lysis buffer (0.3 M NaCl, 50 mM sodium phosphate, 10 mM imidazole at pH 8.0) and brought to 0.2 µM in PMSF prior to three passes through a French press (SLM Instruments). Following sonication for 30 sec, the cell debris was spun at 25,000g for 1.5 h. After passage through a 0.45-µm syringe filter, the supernatant was loaded onto a 20 mL Ni-NTA affinity column (Qiagen) and washed stepwise with 50 mL of buffer (0.3 M NaCl, 50 mM sodium phosphate at pH
8.0) containing increasing concentrations of imidazole (5, 10, 20, 50, and 250 mM). The fraction containing Rfr32 eluted primarily with the 250 mM imidazole