Department of Biochemistry and Center for Eukaryotic Structural Genomics, Medical College of Wisconsin, Milwaukee, Wisconsin 53226, USA
Reprint requests to: Brian F. Volkman, Department of Biochemistry and Center for Eukaryotic Structural Genomics, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin 53226, USA; e-mail: bvolkman{at}mcw.edu; fax: (414) 456-6510.
(RECEIVED May 25, 2005; FINAL REVISION May 25, 2005; ACCEPTED May 31, 2005)
| Abstract |
|---|
β-sheet arranged in an open barrel and two short α-helices, one at each end of the barrel. While At1g16640 is quite distinct from previously characterized B3 domain proteins in terms of amino acid sequence similarity, it adopts the same novel fold that was recently revealed by the RAV1 B3 domain structure. However, putative DNA-binding elements conserved in B3 domains from the RAV, ARF, and ABI3/VP1 subfamilies are largely absent in At1g16640, perhaps suggesting that B3 domains could function in contexts other than transcriptional regulation.
Keywords: B3 domain; NMR; protein structure; structural genomics; bioinformatics
Article published online ahead of print. Article and publication date are at http://www.proteinscience.org/cgi/doi/10.1110/ps.051606305.
| Introduction |
|---|
As part of a structural genomics effort directed at eukaryotic proteins, the At1g16640 protein was selected from the A. thaliana genome as a target likely to reveal novel structural information. At1g16640 was predicted in the Pfam database to contain a DNA-binding domain unique to higher plant species (Bateman et al. 2004). This motif, called the B3 domain (Pfam accession 02362), has been characterized in a number of plant transcription factors. The first B3 domains were identified in the proteins Abscisic Acid-Insensitive 3 (ABI3) from Arabidopsis and Viviparous 1 (VP1) from Zea mays (Giraudat et al. 1992). Since then, many B3 domain-containing proteins have been classified functionally as factors responsive to abscisic acid and auxin, phytohormones that play critical roles in developmental processes such as plant growth and seed maturation (McCarty et al. 1989; Ulmasov et al. 1997). Three major classes of transcription factors containing B3 domains have been identified to date: factors resembling ABI3 and VP1 (ABI3/VP1-like factors), proteins similar to the Arabidopsis protein RAV1 (RAV-like family), and auxin response factors (ARFs) (Riechmann et al. 2000). These B3 domains bind to specific DNA sequences six base pairs in length (Suzuki et al. 1997; Ulmasov et al. 1997; Kagaya et al. 1999). The recognition sequences are conserved among members of the same family, but differ between the three identified families. When At1g16640 was selected as a target, no B3 domain structures had been reported; since then, the first B3 domain structure (from the Arabidopsis protein RAV1) has been determined by NMR (Yamasaki et al. 2004). Based on ambiguous screening results by 2D NMR, the full-length At1g16640 protein (134 residues) was judged an unsuitable target for structure determination and dropped from the production pipeline.
Promising aspects of the initial NMR data and the potential value of a B3 domain structure led us to consider methods for salvaging high-priority targets that may contain folded domains but fail due to aggregation, insolubility, or other problems caused by various portions of the protein.
In this report we show that a bioinformatic approach, combined with experimental screening of a modest number of expression constructs, can be used to rescue proteins that would be otherwise unsuitable for structure determination. Using this approach, we identified a stable, folded domain in the A. thaliana protein At1g16640, a structural genomics target rejected at the HSQC screening stage as a full-length protein. Inspection of the structure of the optimal At1g16640 B3 domain construct determined by NMR spectroscopy reveals a highly conserved tertiary fold.
| Results and Discussion |
|---|
Expression screening
We compared protein expression levels for each At1g16640 domain construct in E. coli at 15°C and 37°C (Fig. 1B). Total protein and the fraction of protein in the soluble cell lysate were assessed by SDS-PAGE and found to vary significantly. These differences depended not only on the construct but also on the expression vector used and the temperature at which the proteins were induced, with no obvious pattern.
Interestingly, removal of the N-terminal affinity tag with TEV protease was successful only for the fusion proteins that included the At1g16640 N terminus (1–92, 1–102, and 1–112), and not for the 8–102 or 8–112 versions. Structural results presented below reveal that Val 7 is the initial residue of the first β-strand of the B3 domain, suggesting that the cleavage site may have been sequestered by secondary structure in the N-terminally truncated constructs.
HSQC screening
A comparison of 2D 15N–1H HSQC spectra of the truncated At1g16640 constructs revealed significant differences (Fig. 1C). We evaluated the spectra based on the number of signals, chemical shift dispersion, and the uniformity of the peak intensities and linewidths. One sample, the 1–92 construct, precipitated heavily, precluding NMR analysis. The At1g16640 1–102 construct produced the best HSQC spectrum, with good peak dispersion and uniform peak intensity. Spectral features in the HSQC of the full-length protein (Fig. 1A) consistent with the presence of disordered residues and aggregation were eliminated. Thus, with a small set of constructs of At1g16640 designed around the predicted B3 DNA-binding domain, we isolated the folded portion of the protein and obtained a more uniform HSQC spectrum.
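The three evaluation criteria named above (number of signals, chemical shift dispersion, and uniformity of peak intensities) can be expressed as simple summary statistics over a picked peak list. The following Python sketch is an illustration only and is not the procedure used in this work; the function name `summarize_hsqc`, the peak-list layout, and the `expected_peaks` parameter are assumptions introduced for the example, and linewidths are omitted.

```python
from statistics import mean, pstdev


def summarize_hsqc(peaks, expected_peaks):
    """Summarize a 2D 15N-1H HSQC peak list (hypothetical helper).

    peaks: list of (h_ppm, n_ppm, intensity) tuples from peak picking.
    expected_peaks: number of backbone amide signals expected for the
        construct (supplied by the user; an assumption of this sketch).
    """
    if not peaks:
        raise ValueError("peak list is empty")
    h_shifts = [p[0] for p in peaks]
    intensities = [p[2] for p in peaks]
    return {
        # Fraction of expected signals actually observed.
        "peak_fraction": len(peaks) / expected_peaks,
        # Spread of amide proton shifts; folded domains tend to be wider.
        "h_dispersion_ppm": max(h_shifts) - min(h_shifts),
        # Coefficient of variation of intensity; lower is more uniform.
        "intensity_cv": pstdev(intensities) / mean(intensities),
    }


# Toy peak list (invented values, not data from this study).
toy_peaks = [(7.0, 110.0, 1.0), (8.0, 115.0, 1.0),
             (9.0, 120.0, 1.0), (10.0, 125.0, 1.0)]
summary = summarize_hsqc(toy_peaks, expected_peaks=4)
```

Such numbers could help rank constructs side by side, but they would complement rather than replace visual inspection of the spectra.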
Structure determination
We determined the structure of the optimized At1g16640 B3 domain corresponding to residues 1–102 by NMR spectroscopy, using an automatic iterative NOE refinement method to obtain a consistent set of experimental constraints. The final NMR structure ensemble is shown in Figure 2, and structural statistics are summarized in Table 1. The structure reveals a compact seven-stranded β-barrel-like topology with a short α-helix near each end. The B3 domain of At1g16640 thus adopts the same novel fold as that first observed in the recently reported RAV1 B3 domain structure (Yamasaki et al. 2004).
In terms of amino acid sequence, At1g16640 is strikingly divergent from other classes of B3 domains, which display high sequence conservation. B3 domains within the ARF class are 72% identical on average. Likewise, RAV-like and ABI3/VP1-like proteins average 64% identity within their subfamilies. RAV1 and At1g16640, despite their structural similarity, share only 26% sequence identity. At1g16640, in fact, shows similarly weak homology to the other two classes, sharing only approximately 22% and 20% identity with ARFs and ABI3/VP1-like proteins, respectively (Poirot et al. 2004). Thus, based on sequence similarity, At1g16640 is unlikely to be categorized as a member of any of these B3 protein subfamilies.
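For readers unfamiliar with how pairwise identity values of this kind are derived from an alignment, a minimal Python sketch follows. It assumes that identity is counted over aligned columns in which neither sequence has a gap; other denominators are in common use, and the convention behind the percentages quoted above is not stated here. The function name `percent_identity` and the toy sequences are illustrative and are not taken from the B3 alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a sequence alignment.

    Assumption of this sketch: only columns where both sequences
    carry a residue (no gap) count toward the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns that are ungapped in both sequences.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences (invented for illustration).
identity = percent_identity("ACDE-", "ACDFG")
```

In this toy case four columns are ungapped and three of them match, so the function returns 75.0.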
Although At1g16640 appears to be quite distinct from the RAV, ARF, and ABI3/VP1 classes of B3 domains, these subfamilies represent only a fraction of B3-containing proteins. The B3 superfamily currently includes 363 members from various plant species, grouped into 16 distinct structural architectures based on their association with other conserved domain combinations (Bateman et al. 2004). Unlike most well-defined B3 proteins, At1g16640 contains only one identifiable domain. By comparison, RAV1 contains an additional DNA-binding motif of the AP2/ERF-type, and most ABI3/VP1- and ARF-like proteins contain additional protein interaction or dimerization domains (Yamasaki et al. 2004). Presumably, the accompanying domains contribute to the biological activity of these transcription factors and in their absence At1g16640 may function quite differently.
Conclusions
In determining the structure of the At1g16640 B3 domain, we have shown that bioinformatic analysis and 2D NMR screening of a small panel of truncated protein constructs can be used to salvage failed structural genomics targets. Our results present the second structure of a B3 domain and show that this novel fold is highly conserved among family members, despite relatively low sequence conservation. The At1g16640 protein has not been shown to bind DNA. Compared to RAV1 and other B3 proteins that bind DNA, At1g16640 has a less electropositive surface, lacks conserved putative DNA-binding residues and possesses no additional recognizable interaction domains. Thus, we hypothesize that At1g16640 may not participate in transcriptional regulation, but instead represents a distinct functional class of B3 domains.
| Materials and methods |
|---|
Protein expression
Plasmids were transformed into E. coli strain SG13009[pREP4] (Qiagen) for expression. Cells were grown in 25 mL LB medium containing 150 µg/mL ampicillin and 50 µg/mL kanamycin at 37°C until reaching a cell density of A600 = 0.6. Isopropyl-β-D-thiogalactopyranoside was then added to a final concentration of 1 mM to induce expression of the proteins. Upon induction, the cultures were split into two equal parts, one grown at 37°C and the other at 15°C. One-milliliter samples were taken 2.5 h and 5 h post-induction for the cultures at 37°C and 5 h and 24 h post-induction for the cultures at 15°C. The samples were harvested, sonicated, and analyzed for protein expression and solubility by SDS-PAGE. After selecting the proper expression conditions for each construct, isotopically labeled proteins were prepared for NMR by growing 1-L cultures in M9 medium containing 15N-ammonium chloride and/or 13C-glucose as the sole nitrogen and carbon sources, respectively.
Protein purification
Cells harvested from a 1-L culture were lysed using a French pressure cell, and the proteins were purified by metal affinity chromatography according to a previously published protocol (Lytle et al. 2004). Following purification, the protein solutions were each dialyzed into 2 × 4 L of 20 mM sodium phosphate at pH 7.0, 50 mM sodium chloride. The resulting purified proteins were then concentrated to 500 µL for analysis by NMR, and the identity and purity of the proteins were verified by SDS-PAGE.
NMR spectroscopy
NMR samples were prepared in buffers containing 20 mM sodium phosphate at pH 7.0, 50 mM sodium chloride, and 5% 2H2O. Soluble domain constructs were screened by 15N–1H HSQC using samples containing ~0.2–0.5 mM U-15N protein, and the sample used for structure determination of At1g16640 1–102 contained ~1 mM U-13C/15N protein. All NMR data were acquired at 25°C on a Bruker 600 MHz spectrometer equipped with a triple-resonance CryoProbe and processed with NMRPipe software (Delaglio et al. 1995). The total acquisition time for all NMR spectra was ~280 h. Over 90% of the backbone 1H, 15N, and 13C resonance assignments were obtained in an automated manner using the program Garant (Bartels et al. 1996), with peak lists from 3D HNCO, HNCACO, HNCA, HNCOCA, HNCACB, and CCONH spectra generated manually with XEASY (Bartels et al. 1995) or automatically with SPSCAN. Side chain assignments were completed manually from 3D HCCONH, HCCH-TOCSY, and 13C(aromatic)-edited NOESY-HSQC spectra.
Structure determination
Distance constraints were obtained from 3D 15N-edited NOESY-HSQC and 13C-edited NOESY-HSQC spectra (τmix = 80 msec). Backbone φ and ψ dihedral angle constraints were generated from secondary shifts of the 1Hα, 13Cα, 13Cβ, 13C', and 15N nuclei using the program TALOS (Cornilescu et al. 1999). Structures were generated in an automated manner using the CANDID module of the torsion angle dynamics program CYANA (Herrmann et al. 2002), which produced an ensemble with high precision and low residual constraint violations that required minimal manual refinement. The 20 CYANA conformers with the lowest target function were subjected to a molecular dynamics protocol in explicit solvent (Linge et al. 2003) using XPLOR-NIH (Schwieters et al. 2003).
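The selection step described above, in which the conformers with the lowest target function are retained for refinement, amounts to a sort followed by a cut. The Python sketch below illustrates that step under stated assumptions: the function name `select_lowest_target`, the tuple layout, and the toy target function values are hypothetical, and the sketch does not read CYANA output files or reproduce any CYANA or XPLOR-NIH interface.

```python
def select_lowest_target(conformers, n=20):
    """Return the n conformers with the lowest target function.

    conformers: list of (conformer_id, target_function) tuples; this
        layout is an assumption of the sketch, not a CYANA file format.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    # Sort ascending by target function value and keep the first n.
    return sorted(conformers, key=lambda c: c[1])[:n]


# Toy ensemble (invented values, not results from this study).
toy_ensemble = [("c1", 2.5), ("c2", 0.8), ("c3", 1.7), ("c4", 3.1)]
best_two = select_lowest_target(toy_ensemble, n=2)
```

With the toy ensemble, the two retained conformers are `c2` and `c3`, in order of increasing target function.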
| Accession numbers |
|---|
| Acknowledgments |
|---|
| References |
|---|
Bartels, C., Billeter, M., Güntert, P., and Wüthrich, K. 1996. Automated sequence-specific NMR assignments of homologous proteins using the program GARANT. J. Biomol. NMR 7: 207–213.
Bateman, A., Coin, L., Durbin, R., Finn, R.D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., Sonnhammer, E.L., et al. 2004. The Pfam protein families database. Nucleic Acids Res. 32: D138–D141.
Cornilescu, G., Delaglio, F., and Bax, A. 1999. Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13: 289–302.
Delaglio, F., Grzesiek, S., Vuister, G.W., Zhu, G., Pfeifer, J., and Bax, A. 1995. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6: 277–293.
DeLano, W.L. 2002. The PyMOL molecular graphics system. DeLano Scientific, San Carlos, CA.
Gibrat, J.F., Madej, T., and Bryant, S.H. 1996. Surprising similarities in structure comparison. Curr. Opin. Struct. Biol. 6: 377–385.
Giraudat, J., Hauge, B.M., Valon, C., Smalle, J., Parcy, F., and Goodman, H.M. 1992. Isolation of the Arabidopsis ABI3 gene by positional cloning. Plant Cell 4: 1251–1261.
Herrmann, T., Güntert, P., and Wüthrich, K. 2002. Protein NMR structure determination with automated NOE assignment using the new software CANDID and the torsion angle dynamics algorithm DYANA. J. Mol. Biol. 319: 209–227.
Huth, J.R., Bewley, C.A., Jackson, B.M., Hinnebusch, A.G., Clore, G.M., and Gronenborn, A.M. 1997. Design of an expression system for detecting folded protein domains and mapping macromolecular interactions by NMR. Protein Sci. 6: 2359–2364.
Kagaya, Y., Ohmiya, K., and Hattori, T. 1999. RAV1, a novel DNA-binding protein, binds to bipartite recognition sequence through two distinct DNA-binding domains uniquely found in higher plants. Nucleic Acids Res. 27: 470–478.
Koradi, R., Billeter, M., and Wüthrich, K. 1996. MOLMOL: A program for display and analysis of macromolecular structures. J. Mol. Graph. 14: 51–55.
Linge, J.P., Williams, M.A., Spronk, C.A., Bonvin, A.M., and Nilges, M. 2003. Refinement of protein structures in explicit solvent. Proteins 50: 496–506.
Lytle, B.L., Peterson, F.C., Qiu, S.H., Luo, M., Zhao, Q., Markley, J.L., and Volkman, B.F. 2004. Solution structure of a ubiquitin-like domain from tubulin-binding cofactor B. J. Biol. Chem. 279: 46787–46793.
McCarty, D.R., Carson, C.B., Stinard, P.S., and Robertson, D.S. 1989. Molecular analysis of viviparous-1: An abscisic acid-insensitive mutant of maize. Plant Cell 1: 523–532.
Poirot, O., Suhre, K., Abergel, C., O'Toole, E., and Notredame, C. 2004. 3DCoffee@igs: A web server for combining sequences and structures into a multiple sequence alignment. Nucleic Acids Res. 32: W37–W40.
Riechmann, J.L., Heard, J., Martin, G., Reuber, L., Jiang, C., Keddie, J., Adam, L., Pineda, O., Ratcliffe, O.J., Samaha, R.R., et al. 2000. Arabidopsis transcription factors: Genome-wide comparative analysis among eukaryotes. Science 290: 2105–2110.
Schwieters, C.D., Kuszewski, J.J., Tjandra, N., and Clore, G.M. 2003. The Xplor-NIH NMR molecular structure determination package. J. Magn. Reson. 160: 65–73.
Suzuki, M., Kao, C.Y., and McCarty, D.R. 1997. The conserved B3 domain of VIVIPAROUS1 has a cooperative DNA binding activity. Plant Cell 9: 799–807.
Tyler, R.C., Aceti, D.J., Bingman, C.A., Cornilescu, C.C., Fox, B.G., Frederick, R.O., Jeon, W.B., Lee, M.S., Newman, C.S., Peterson, F.C., et al. 2005a. Comparison of cell-based and cell-free protocols for producing target proteins from the Arabidopsis thaliana genome for structural studies. Proteins 59: 633–643.
Tyler, R.C., Sreenath, H.K., Singh, S., Aceti, D.J., Bingman, C.A., Markley, J.L., and Fox, B.G. 2005b. Auto-induction medium for the production of [U-15N]- and [U-13C, U-15N]-labeled proteins for NMR screening and structure determination. Protein Expr. Purif. 40: 268–278.
Ulmasov, T., Hagen, G., and Guilfoyle, T.J. 1997. ARF1, a transcription factor that binds to auxin response elements. Science 276: 1865–1868.
Yamasaki, K., Kigawa, T., Inoue, M., Tateno, M., Yamasaki, T., Yabuki, T., Aoki, M., Seki, E., Matsuda, T., Tomo, Y., et al. 2004. Solution structure of the B3 DNA binding domain of the Arabidopsis cold-responsive transcription factor RAV1. Plant Cell 16: 3448–3459.