1 Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska 68588, USA
2 Department of Microbiology and Immunology, Institute for Computational Biomedicine, and Northeast Structural Genomics Consortium, Weill Medical College of Cornell University, New York, New York 10021, USA
3 Department of Biochemistry and Molecular Biophysics, and Northeast Structural Genomics Consortium, and 4 Howard Hughes Medical Institute, Columbia University, New York, New York 10032, USA
5 Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, and Northeast Structural Genomics Consortium, Rutgers University, Piscataway, New Jersey 08854, USA
6 Biological Sciences Division, and Northeast Structural Genomics Consortium, Pacific Northwest National Laboratory, Richland, Washington 99352, USA
7 Department of Biochemistry, Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, Piscataway, New Jersey 08854, USA
Reprint requests to: Robert Powers, Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE 68588, USA; e-mail: rpowers3{at}unl.edu; fax: (402) 472-2044.
(RECEIVED June 21, 2005; FINAL REVISION August 4, 2005; ACCEPTED August 21, 2005)
| Abstract |
|---|
[...] α-helices and a mixed β-sheet consisting of four parallel and anti-parallel β-strands, where the α-helices sandwich the β-sheet. Sequence and structural comparison of AF2095 with proteins from Homo sapiens, Methanocaldococcus jannaschii, and Sulfolobus solfataricus reveals that AF2095 is a peptidyl-tRNA hydrolase (Pth2). This structural comparison also identifies putative catalytic residues and a tRNA interaction region for AF2095. The structure of AF2095 is also similar to the structure of protein TA0108 from the archaeon Thermoplasma acidophilum, which is deposited in the Protein Data Bank but not functionally annotated. The NMR structure of AF2095 has been further leveraged to obtain good-quality structural models for 55 other proteins. Although earlier studies have proposed that the Pth2 protein family is restricted to archaeal and eukaryotic organisms, the similarity of the AF2095 structure to human Pth2, the conservation of key active-site residues, and the good quality of the resulting homology models demonstrate a large family of homologous Pth2 proteins that are conserved in eukaryotic, archaeal, and bacterial organisms, providing novel insights into the evolution of the Pth and Pth2 enzyme families.
Keywords: NMR; Archaeoglobus fulgidus; protein AF2095; solution structure; peptidyl-tRNA hydrolase Pth2; Pth2 evolution
Abbreviations: 3D, three dimensional; AES, N-terminal enhancer of split; Bit1, Bcl-2 inhibitor of transcription 1; DTT, dithiothreitol; HNHA, amide proton to nitrogen to CαH proton; HSQC, hetero-nuclear single quantum coherence spectroscopy; MES, 2-[N-morpholino]ethanesulfonic acid; NCBI, National Center for Biotechnology Information; NMR, nuclear magnetic resonance; NR, nonredundant; NOE, nuclear Overhauser effect; NOESY, nuclear Overhauser enhancement spectroscopy; Pth, peptidyl tRNA hydrolase; PSSM, position-specific scoring matrix.
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.051666705.
| Introduction |
|---|
[...] α/β fold with a twisted mixed β-sheet (Schmitt et al. 1997). This E. coli Pth structure was determined to be similar to an aminopeptidase, which permitted the mapping of the enzyme's active site to a catalytic triad composed of residues His 20, Asp 93, and His 113 (Goodall et al. 2004).
Although Pth is essential for viability of bacteria (Menez et al. 2002) and present in most eukaryotes, a number of archaeal genomes contain no recognizable Pth homologs (Rosas-Sandoval et al. 2002; Fromant et al. 2003). Recently, unique Pth (Pth2) activities have been identified in the archaeon Methanocaldococcus jannaschii and the thermophilic archaeon Sulfolobus solfataricus. These enzymes, MJ0051 from M. jannaschii (Y051_METJA) and SSO0175 from S. solfataricus (Y175_SULSO), lack any sequence similarity with the well-characterized bacterial Pth enzyme class (Schmitt et al. 1997; de Pereda et al. 2004). The recently determined X-ray crystal structure of human Pth2 (Protein Data Bank [PDB] ID 1q7s; de Pereda et al. 2004), the first 3D structure of a Pth2 enzyme, reveals that in addition to the lack of sequence similarity between Pth and Pth2, these enzymes also exhibit completely different folds. The Pth2 structure resembles the thioredoxin family fold, but exhibits E. coli tRNALys hydrolase activity. The active site of human Pth2 was assigned to residues in the β1–α1 loop and β3–β4 loop based on the close spatial proximity of conserved residues of Pth2 enzymes. Despite this lack of sequence and structural similarity between Pth and Pth2, the biological activities of the two enzyme classes are complementary. Indeed, E. coli strains lacking endogenous Pth are rescued by the expression of Pth2 (Rosas-Sandoval et al. 2002; Fromant et al. 2003). Eukaryotes contain both Pth and Pth2 enzymes, where Pth2 enzymes are thought to be involved in mitochondrial protein synthesis (Rosas-Sandoval et al. 2002; Jan et al. 2004; Stupack and Cheresh 2004). In addition, a Pth/Pth2 double mutant in Saccharomyces cerevisiae is viable, suggesting functional redundancy of Pth enzymes with some other eukaryotic enzyme(s).
Previously reported sequence analysis based on the Pth2 enzymes from human, archaea M. jannaschii, and thermophilic archaea S. solfataricus indicated that the Pth2 enzyme class was identified in numerous archaea and eukaryote organisms but was not present in bacteria (Rosas-Sandoval et al. 2002; Fromant et al. 2003; de Pereda et al. 2004). In addition, these enzymes were identified as members of the unannotated UPF0099 Pfam family, which only contained archaeal and eukaryotic proteins at the time of this analysis.
The human Pth2 enzyme is also known as Bcl-2 inhibitor of transcription 1 (Bit1). Bit1 is a mitochondrial protein that regulates apoptosis in a caspase-independent mechanism (Jan et al. 2004; Stupack and Cheresh 2004). The enzyme is located in the outer mitochondrial membrane, where, in the absence of integrin-mediated cell attachment, it migrates to the cytosol. Apoptosis is then initiated when Bit1 forms a complex with N-terminal enhancer of split (AES), which is a member of the Groucho family of transcriptional co-repressors. The exact details of the Bit1-mediated apoptosis pathway and the role of its Pth activity are not well understood.
The AF2095 protein from the thermophilic archaeon Archaeoglobus fulgidus is an example of a protein of unknown biological function targeted for structural analysis by the Northeast Structural Genomics Consortium (NESG; http://www.nesg.org) (Liu et al. 2004; Wunderlich et al. 2004); it also belongs to the previously unannotated Pfam family UPF0099. In this article, we describe the 3D solution NMR structure of AF2095, together with the structural and functional insights provided by this structure. Comparison of the 3D NMR structure of A. fulgidus AF2095 and the 1.95 Å X-ray crystal structure of the functionally unannotated protein TA0108 from the thermophilic archaeon T. acidophilum, which has recently been deposited in the PDB (ID 1RLK), with the human Pth2 X-ray structure (de Pereda et al. 2004) suggests that these proteins are also Pth2 enzymes. Although it had been proposed that the Pth2 family has no bacterial members (Rosas-Sandoval et al. 2002; Fromant et al. 2003; de Pereda et al. 2004), assessments of homology models based on the 3D structure of AF2095 reported here demonstrate a wide phylogenetic distribution of the Pth2 enzyme class, including many bacterial proteins, consistent with the recent inclusion of bacterial members in Pfam family UPF0099. The structural and functional analysis also provides some insight into the evolution of the Pth and Pth2 enzyme families.
| Results and Discussion |
|---|
[...] φ,ψ plot, 15.9% lie in the additionally allowed region, and 1.9% lie in the generously allowed region.
[...] α/β fold composed of a twisted, mixed four-stranded β-sheet (II-I-IV-III) with two α-helices packed against each surface of the β-sheet. In one case, the two α-helices may be viewed as one continuous helix with an <90° bend that separates the two helical regions (Fig. 1) [...] β-sheet, the three primary α-helices form a triangular arrangement with the β-strands sandwiched between the helices. The four-stranded β-sheet corresponds to residues 4–10 (β1), 49–54 (β2), 75–78 (β3), and 92–98 (β4), and the four helical regions correspond to residues 17–36 (α1), 39–45 (α2), 58–72 (α3), and 101–108 (α4). There are two distinct regions of the AF2095 structure that are suggestive of being dynamic and disordered, as evidenced by a lack of chemical shift assignments (Powers et al. 2004). The C-terminal region of AF2095 that includes residues P112–H123 appears to be disordered. This region of the protein also includes the LEHHHHHH purification tag. It is also interesting to note that this region of the protein includes four closely spaced leucine residues (L111, L113, L114, L116). Early on in the data collection and resonance assignments for AF2095, the NMR data quality was unexpectedly low for a protein the size of AF2095. A significant improvement in spectral quality was obtained with elevated temperatures, and the subsequent observation of disorder in the partially hydrophobic C-terminal tail appears to suggest a weak aggregation of the protein through an association of the C-terminal region of the protein. The improvement in the quality of the NMR data with a higher temperature is also consistent with the thermophilic nature of A. fulgidus and is probably closer to the functional conditions for which this protein has evolved.
The loop region consisting of residues Q79–I91, which connects β-strands β3 and β4, also appears to be dynamic and disordered in nature (Fig. 1). Residues L83–V86 have limited or a complete lack of NMR resonance assignments and, as a result, an absence of structural information. The lack of NMR assignments for residues L83–V86 arises because of a complete absence of any peaks that could be attributed to these residues in any NMR experiment, including the 2D 1H-15N HSQC spectra. There is no evidence to suggest that the lack of assignments could be attributed to peak overlap (Powers et al. 2004). The residues neighboring L83–V86 also exhibit generally weaker peak intensity and minimal structural NOEs.
Analysis of the surface structure of AF2095 using GRASP reveals a distinct cavity with a positive electrostatic potential suggestive of a potential binding site (Fig. 2). The AF2095 cavity is bordered by helix α1 and β-strand β2, and the base of the cavity is primarily formed by residues I91–V95 and residues L7–R10. These residues correspond to β-strands β4 and β1 and the intervening loop that connects the two β-strands.
Structural homologs of AF2095
The recent X-ray crystal structure of human Pth2 (YCE7_HUMAN; PDB ID 1q7s), also called Bcl-2 inhibitor of transcription 1 (Bit1), indicates that Pth2 enzymes adopt a novel protein fold (de Pereda et al. 2004). Even though eukaryotic/archaeal Pth2 and bacterial Pth (Schmitt et al. 1997) protein structures have very different folds, it has been proposed that Pth and Pth2 have essentially the same catalytic mechanism as do other α/β-fold hydrolases: a catalytic triad composed of a nucleophile, a basic, and an acidic amino acid. This is supported by the observation that human Pth2 exhibits catalytic activity in cleaving E. coli purified diacetyllysyl-tRNALys to produce free diacetyl-lysine and tRNA. A proposed tRNA interaction region of human Pth2 was inferred from a nonuniform distribution of a positive electrostatic surface potential comprising the β1–α1 loop, the β3–β4 loop, and α1. Residues involved in the catalytic activity of human Pth2 were inferred from conserved and spatially proximal surface residues that form a potential active site.
AF2095 shares 36% sequence identity with human Pth2 and 44% sequence identity with conserved hypothetical protein TA0108 from T. acidophilum (Y108_THEAC; PDB ID 1rlk). Both the human Pth2 and TA0108 sequences were identified in the PSI-BLAST searches seeded with the AF2095 sequence. The 1.95 Å X-ray structure for TA0108 has been determined by the Midwest Center for Structural Genomics Research and is currently unpublished. The reported 2.0 Å X-ray structure for human Pth2 corresponds to residues 66–179 of the complete protein sequence, where residues 1–65 have been tentatively assigned to a localization signal in eukaryotes. Structural alignment of AF2095 with both TA0108 and human Pth2 in PrISM (Yang and Honig 1999) yields 3.8 Å and 3.7 Å RMSD, respectively (Fig. 3). The largest deviations in tertiary structure between AF2095 and human Pth2 occur in the β1–α1 loop, the β3–β4 loop, and the N terminus of α1, which have different relative orientations in these structures.
[...] β3–β4 loop in AF2095 lacks any experimental data to define its relative conformation, the distinct orientation and packing of the α1–α2 region is defined by 156 long-range NOEs and 363 short-range NOEs. These NOEs are equally distributed over the length of the helices and are all satisfied in the AF2095 NMR structure. Similarly, the α1–α2 region is also consistent with φ/ψ torsion angles based on 13Cα/Cβ chemical shifts as defined by TALOS (Cornilescu et al. 1999). Specifically, the carbon chemical shifts and TALOS predict a modest deviation from typical φ/ψ α-helical values between residues K34 and D36. Additional validation for the proper packing of the AF2095 NMR structure is provided by MolProbity analysis of bad contacts (Lovell et al. 2003). The packing violations observed for AF2095 are consistent with good-quality NMR structures. In addition, the violations are distributed throughout the 3D structure and are not concentrated in the β1–α1 loop, the β3–β4 loop, or the N terminus of α1, which differ from those of the human Pth2 and TA0108 structures. Presumably, if there were a packing error in these regions of the AF2095 structure, there would be a concentration of bad contacts identified by the MolProbity analysis (see Supplemental Material).
The sequence-based alignment between AF2095 and human Pth2 indicates a three-residue deletion in the region of α1 and α2. The shorter length of this helical region in AF2095 may explain the different orientation of the N terminus of α1 compared with that of human Pth2. The N terminus of α1 in AF2095 may be displaced to compensate for this three-residue deletion and to allow for the proper alignments of β1 and β4 with the remainder of the β-sheet. Also, residues that comprise the disordered β3–β4 loop in AF2095 are proximal to the N terminus of α1. This interaction may contribute to the displacement of α1, especially since the average conformation of the disordered loop in the AF2095 NMR structure is distinct from the human Pth2 X-ray structure. The fact that the AF2095 NMR structure was determined at an elevated temperature (40°C) may further contribute to these observed structural differences.
The proposed active site residues of human Pth2 are conserved in AF2095; however, due to the structural differences in the two proteins, particularly the dynamic loop conformations, these residues are not well aligned in the structures. Superposition of the AF2095 NMR structure with human Pth2 and TA0108 only aligns D80 with the corresponding acid residues, where T90 from AF2095 actually aligns with the basic catalytic residues. Again, these putative functional residues correspond to a dynamic and disordered loop region (Q79–I91) in the AF2095 structure, so a low alignment with the Pth2 X-ray structures is not surprising. Based on the conservation of the putative functional residues of human Pth2 in these three proteins and in the AF2095 family, the active-site residues for AF2095 and TA0108 can be inferred (Table 2).
[...] β3–β4 loop and the poor alignment of the catalytic triad may imply a conformational adjustment or stabilization of the AF2095 structure when the protein binds a peptidyl-tRNA. A conformational change or an induced fit upon the binding of a ligand in the enzyme's active site is a common occurrence (Burkhard et al. 1999; Hynson et al. 2004; Romanowski et al. 2004; Venkitakrishnan et al. 2004). Enzymes that bind tRNA exhibit similar ligand-induced conformational changes upon binding tRNA (Crepin et al. 2003; Zhang et al. 2003; Phannachet and Huang 2004). In fact, the binding of tRNA to aminoacyl-tRNA synthetase is required to properly arrange the active site for the catalytic activity of these enzymes (Francklyn et al. 2002; Sherlin and Perona 2003). A similar requirement may also exist for AF2095 and other Pth2 enzymes. Thus, the differences observed between the alignment of the NMR structure of AF2095 with the X-ray structures of TA0108 and human Pth2 may simply reflect changes in the dynamic behavior of the proteins in the solution and solid state. Again, significant differences in the dynamic behavior of proteins have been observed when comparing NMR and X-ray structures (Powers et al. 1993; Moy et al. 1997, 1998).
Similar to TA0108 and human Pth2, the surface of AF2095 exhibits a positive electrostatic potential, consistent with a nucleic acid binding function. In addition, all three proteins have a negatively charged cluster in the vicinity of the proposed active site surface (Fig. 2B). The sequence and structure homology between human Pth2 and AF2095 implies that AF2095 functions as a Pth2 enzyme. Furthermore, it also implies that protein TA0108 from T. acidophilum, which is currently defined as a conserved hypothetical protein, is also a Pth2 enzyme.
Leverage analysis of AF2095
NESG prioritizes protein targets for structure elucidation by targeting sequence clusters (Liu et al. 2004; Wunderlich et al. 2004). In this manner, experimental structural information that is obtained for any representative protein in a given cluster can be leveraged to predict a quality structural model for the remaining protein members of the cluster by means of homology modeling. AF2095 has been assigned to NESG Cluster ID 17431, which contains a total of 14 proteins from human, Drosophila, Caenorhabditis elegans, Arabidopsis, yeast, archaea, and eubacteria. When the structural analysis of AF2095 was initiated, no experimental structures were available for any members of this NESG cluster.
PSI-BLAST searches with AF2095 as the initial query against the nonredundant (NR) database were performed, which identified 67 significant hits. A 68th protein (hypothetical protein Mbur208901) was identified by inspecting the multiple sequence alignment of MJ0051 homologs (Rosas-Sandoval et al. 2002). Essentially, all the sequences identified to be homologous to AF2095 are annotated as proteins of unknown function (see Supplemental Table 1S). Thus, the NMR structure for AF2095 reported herein may be leveraged to obtain both structural and functional information for an additional 68 proteins.
An effort was made to validate homologs identified by PSI-BLAST by evaluating the quality of the resulting homology models (see Supplemental Table 1S). After removal of human Pth2, TA0108, and two proteins with >95% sequence identity with AF2095 from the leverage list, MODELLER (Sanchez and Sali 1998a) yielded good models based on the normalized ProsaII Z-scores (pG > 0.7) (Sippl 1993) for 63 sequences. Removing likely isoforms or polymorphisms yielded a revised leverage list of 53 high-quality and sequence-unique models for the AF2095 NMR structure.
For two sequences (GenBank EAA17753 and T05298) that are significantly similar to AF2095 (e-values of 1e-28 and 8e-18 after the last round of PSI-BLAST, respectively), initial models obtained were unreliable with pG scores of 0.55 and 0.00, respectively. In such cases, the alignment most likely contains errors. As an example, T05298 has a 42-residue insertion (region 144–185) that is not covered by the AF2095 template. Removal of the 42-residue insertion region and modeling of the remainder of the sequence provided a relatively reliable model for T05298 (pG score, 0.66). For EAA17753, a reliable model (pG score 0.78) could be made by using 1q7s and 1rlk as additional templates to the AF2095 NMR structure. Therefore, we estimate that the leverage for the AF2095 NMR structure is at least 55 sequences when EAA17753 and T05298 are included. The total increases to 65 with the inclusion of highly similar sequences.
Previous sequence alignments, which did not include the analysis of the quality of predicted homology models, have been reported by using human Pth2 (de Pereda et al. 2004) or archaeal Pth2 proteins from M. jannaschii (Rosas-Sandoval et al. 2002) and S. solfataricus (Fromant et al. 2003). A significant observation reported from these prior alignment efforts indicated that Pth2 homologs were not present in bacteria. In contrast, the AF2095 leverage list contains proteins from archaea, eukaryotes, and bacteria, including bacterial proteins from Corynebacterium diphtheriae, Corynebacterium efficiens, Corynebacterium glutamicum, Streptomyces avermitilis, and Streptomyces coelicolor. These bacterial proteins have 26%–29% sequence similarity to AF2095, with pG scores ranging from 0.81 to 0.98 for the homology models. Our analysis of sequence and structure homologs of AF2095 indicates that Pth2 enzymes have a wider phylogenetic distribution than previously thought. Recently, the Pfam family UPF0099 has been updated to include these bacterial proteins.
Phylogenic analysis and evolution of Pth and Pth2 enzymes
The complete lack of sequence and structural similarity between the Pth and Pth2 enzyme families, while maintaining comparable hydrolase activity, is suggestive of convergent evolution. Our structural and functional analysis of AF2095 allows us to propose a potential evolutionary pathway for the Pth and Pth2 enzymes. It appears that eukaryotes inherited Pth2 enzymes from the archaea lineage during the formation of the mitochondria from the endosymbiosis of two prokaryotes (Brown and Doolittle 1997). Similarly, bacteria appear to have inherited Pth2 from archaea by lateral gene transfer (Hao and Golding 2004) after the emergence of the three separate domains. A phylogenic tree (Fig. 4) based on sequence similarity of at least <30% with AF2095 supports this proposed Pth2 evolutionary pathway.
The next closely related branch in the AF2095 phylogenic tree (colored red) contains no archaeal Pth2 sequences, but it contains all the bacterial and viral genomes with an identified Pth2 enzyme. Again, this would be consistent with lateral gene transfer of Pth2 to bacteria from archaea after the separation of these domains. All the bacterial genomes that were identified to contain a Pth2 enzyme also contain Pth enzymes. Since Pth activity is crucial for bacterial viability, this observation would be consistent with bacteria inheriting the redundant Pth2 gene from archaea as opposed to the simultaneous evolution within the same organism of functionally redundant enzymes that are distinct in sequence and structure. This analysis would also imply that the Pth gene evolved separately through the bacterial lineage and was inherited by eukaryotes when eukaryotes and eubacteria diverged (Rosas-Sandoval et al. 2002). Again, this is consistent with eukaryotes containing both Pth and Pth2 enzymes, where Pth2 is a mitochondrial protein. The two remaining branches are almost exclusively either archaeal or eukaryotic proteins, where the eukaryotes are primarily higher organisms, such as human, mouse, rat, and fly, representing distant evolution of the Pth2 enzyme.
Conclusion
The NMR solution structure for AF2095 has provided insight into the function of AF2095. The sequence and structure similarity between AF2095 and human Pth2 suggests that AF2095 is a peptidyl-tRNA hydrolase, specifically a Pth2 enzyme. Structural comparison between the Pth2 structures and the prediction that the hydrolase contains a catalytic triad composed of a nucleophile, a basic, and an acidic amino acid (Sanishvili et al. 2003; de Pereda et al. 2004) enables the prediction of residues in AF2095 that may comprise a catalytic site, specifically K19 (located at the beginning of α1), D80, and T90 (located at the β3–β4 loop), which are conserved in AF2095 homologs. These residues are proximal to the proposed tRNA interaction region, which has a strong electrostatic positive field. However, the distances between these putative catalytic residues in AF2095 would likely preclude them from forming a triad in the conformation observed in the AF2095 structure. Thus, the proper assembly of the active site may first require binding the peptidyl-tRNA as previously observed with aminoacyl-tRNA synthetase. The properties of the electrostatic surface of AF2095 are consistent with a nucleic acid binding function as well as with the previous analysis of Pth2 structures. The AF2095 NMR structure provides structural and functional leverage for 55 protein sequences from a variety of organisms, including five bacterial proteins. The AF2095 leverage analysis implies that Pth2 is an extensive family present in archaea, bacteria, and eukaryotes, which is contrary to prior predictions but consistent with the recent addition of bacterial proteins to Pfam family UPF0099. Phylogenic analysis of the Pth2 enzymes homologous to AF2095 supports convergent evolution of the Pth and Pth2 enzymes, suggesting that eukaryotes inherited Pth2 as part of the mitochondrial organelle.
| Materials and methods |
|---|
1H, 15N, 13C, and 13CO assignments and secondary structure determination of AF2095 were reported previously (Powers et al. 2004). In addition to the NMR experiments used for the AF2095 resonance assignments, the present structure is based on the following series of spectra: HNHA (Vuister and Bax 1993), 1H-15N HSQC, and 3D 15N- (Marion et al. 1989; Zuiderweg and Fesik 1989) and 13C-edited NOESY (Ikura et al. 1990; Zuiderweg et al. 1990) experiments. The 15N-edited NOESY and 13C-edited NOESY experiments were collected with 100-msec and 80-msec mixing times, respectively. Spectra were processed by using the NMRPipe software package (Delaglio et al. 1995) and analyzed with PIPP (Garrett et al. 1991) on a Linux workstation.
Interproton distance restraints
The NOEs assigned from 3D 13C-edited NOESY and 3D 15N-edited NOESY experiments were classified into strong, medium, weak, and very weak corresponding to interproton distance restraints of 1.8–2.7 Å (1.8–2.9 Å for NOEs involving NH protons), 1.8–3.3 Å (1.8–3.5 Å for NOEs involving NH protons), 1.8–5.0 Å, and 3.0–6.0 Å, respectively (Williamson et al. 1985; Clore et al. 1986). Upper distance limits for distances involving methyl protons and nonstereospecifically assigned methylene protons were corrected appropriately for center averaging (Wüthrich et al. 1983). Hydrogen bond restraints were deduced on the basis of slowly exchanging NH protons, which were identified by recording a 1H-15N-HSQC spectrum after exchanging an AF2095 sample from H2O to D2O. Two distance restraints were used for each hydrogen bond (rNH-O = 1.5–2.3 Å, rN-O = 2.4–3.3 Å).
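The classification above amounts to a lookup from NOE intensity class to a distance range. The following is a minimal sketch of that lookup, not the authors' software: the names `NOE_BOUNDS` and `noe_restraint` are hypothetical, and the center-averaging correction for methyl and methylene protons is deliberately left out.

```python
# Distance bounds in angstroms for each NOE intensity class, as listed in the text.
# Tuple layout: (lower, upper, upper when an NH proton is involved).
# The dictionary and function names are illustrative, not from the original work.
NOE_BOUNDS = {
    "strong": (1.8, 2.7, 2.9),
    "medium": (1.8, 3.3, 3.5),
    "weak": (1.8, 5.0, 5.0),
    "very_weak": (3.0, 6.0, 6.0),
}


def noe_restraint(noe_class, involves_nh=False):
    """Return the (lower, upper) distance restraint for an NOE class.

    No pseudo-atom (center-averaging) correction is applied here.
    """
    lower, upper, upper_nh = NOE_BOUNDS[noe_class]
    return (lower, upper_nh if involves_nh else upper)
```

For example, `noe_restraint("medium", involves_nh=True)` yields the wider 1.8–3.5 Å range quoted for NOEs involving NH protons.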
Torsion angle restraints
The φ and ψ torsion angle restraints were obtained from chemical shift analysis by using the TALOS program (Cornilescu et al. 1999) and from consistency with distance restraints for intraresidue and sequential NOEs involving NH, CαH, and CβH protons, and 3J(HN-Hα) coupling constants measured from the relative intensity of Hα cross-peaks to the HN diagonal in the HNHA experiment (Vuister and Bax 1993). The minimum ranges employed for the φ and ψ torsion angle restraints were ±30° and ±50°, respectively.
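As an illustration of how a coupling constant can be extracted from an HNHA intensity ratio, the sketch below uses the commonly quoted relation for this experiment, I_cross/I_diag = −tan²(2πJζ). It is an assumption-laden example rather than the processing actually used for AF2095: the function name is hypothetical, the default dephasing delay `zeta` of 13.05 ms is an assumed typical value, and no correction for relaxation effects is applied.

```python
import math


def hnha_j_coupling(i_cross, i_diag, zeta=0.01305):
    """Estimate 3J(HN-Halpha) in Hz from HNHA peak intensities.

    Assumes I_cross / I_diag = -tan^2(2 * pi * J * zeta), with zeta in seconds.
    The default zeta is an assumed typical delay, not a value from the text.
    """
    ratio = -i_cross / i_diag  # cross-peak and diagonal have opposite signs
    if ratio < 0:
        raise ValueError("cross-peak and diagonal must have opposite signs")
    return math.atan(math.sqrt(ratio)) / (2.0 * math.pi * zeta)
```

The value returned is an uncorrected estimate; in practice the measured coupling is often scaled to account for relaxation during the dephasing period.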
Structure calculations
An initial structural fold for AF2095 was determined by automated analysis of NMR data using the programs AutoStructure (Huang 2001; Huang et al. 2003; Zheng et al. 2003; Huang et al. 2005) and DYANA (Guntert et al. 1997) along with slow amide exchange and 3J(HN-Hα). The best DYANA structure was then used in further refinement by using XPLOR-NIH software. The structures were further refined using the hybrid distance geometry dynamical-simulated annealing method of Nilges et al. (1988c) with minor modifications (Clore et al. 1990), using the program XPLOR-NIH (Schwieters et al. 2003), adapted to incorporate pseudopotentials for 3J(HN-Hα) coupling constants (Garrett et al. 1994), secondary 13Cα/13Cβ chemical shift restraints (Kuszewski et al. 1995), radius of gyration (Kuszewski et al. 1999), and a conformational database potential (Kuszewski et al. 1996, 1997; Kuszewski and Clore 2000). The target function that is minimized during restrained minimization and simulated annealing comprises only quadratic harmonic terms for covalent geometry, 3J(HN-Hα) coupling constants, and secondary 13Cα/13Cβ chemical shift restraints, square-well quadratic potentials for the experimental distance, radius of gyration and torsion angle restraints, and a quartic van der Waals term for nonbonded contacts. The radius of gyration can be predicted with reasonable accuracy on the basis of the number of residues using a relationship determined empirically from the analysis of high-resolution X-ray structures (Kuszewski et al. 1999). The force constants for the conformational database and radius of gyration potentials were kept relatively low throughout the simulation to allow the experimental distance and torsion angle restraints to predominantly influence the resulting structures. The force constants for the NOE and dihedral restraints were 30 times and 10 times stronger than those used for the conformational database and radius of gyration potentials, respectively. All peptide bonds were constrained to be planar and trans. There were no hydrogen-bonding, electrostatic, or 6–12 Lennard-Jones empirical potential energy terms in the target function.
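The square-well quadratic potential mentioned above can be illustrated with a short sketch: the penalty is zero while a value lies inside its allowed range and grows quadratically outside it. This is a generic textbook form under stated assumptions, not XPLOR-NIH's implementation; the function name and the force constant `k` are hypothetical.

```python
def square_well_penalty(value, lower, upper, k=1.0):
    """Generic square-well quadratic penalty for a restrained quantity.

    Returns 0 inside [lower, upper] and k * (violation)^2 outside.
    `k` is an illustrative force constant with no units implied.
    """
    if value < lower:
        return k * (lower - value) ** 2
    if value > upper:
        return k * (value - upper) ** 2
    return 0.0
```

With a strong-NOE range of 1.8–2.7 Å, a distance of 2.5 Å contributes nothing, whereas 3.0 Å is penalized in proportion to the square of the 0.3 Å violation.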
Leverage analysis
Determination of the number of protein sequences for which quality homology models can be built with the structure of AF2095 as a modeling template was performed in several automated steps. First, a PSI-BLAST (Altschul et al. 1997) search was performed against the National Center for Biotechnology Information (NCBI) NR protein database with AF2095 as the query sequence. The threshold for inclusion of a hit in the PSI-BLAST position-specific scoring matrix (or PSSM) was 0.0005, and 10 rounds of iteration were allowed. Significant hits were defined as those hits with e-values in the final round of <0.001. Next, the sequences of proteins of known structure and redundant sequences from the same species were removed from the hit list. For the purpose of our analysis, two sequences from the same species were considered redundant if their regions aligned to AF2095 are of the same length and >95% identical to each other. The remaining significant hits comprise our "homolog list."
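The filtering steps just described (significance cutoff, removal of known structures, and removal of same-species redundancy) can be sketched as follows. This is an illustrative outline only: the `Hit` record, the `build_homolog_list` function, and the caller-supplied `identity` function are hypothetical names, and the sketch does not run PSI-BLAST itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one PSI-BLAST hit."""
    name: str
    species: str
    e_value: float            # e-value in the final PSI-BLAST round
    aligned_length: int       # length of the region aligned to the query
    has_known_structure: bool = False


def build_homolog_list(hits, identity, e_cutoff=0.001, redundancy_cutoff=0.95):
    """Apply the significance, known-structure, and redundancy filters.

    `identity(a, b)` must return the fractional identity of the aligned regions.
    Two hits are redundant if they come from the same species, have aligned
    regions of the same length, and are more than 95% identical.
    """
    kept = []
    for hit in hits:
        if hit.e_value >= e_cutoff or hit.has_known_structure:
            continue
        redundant = any(
            other.species == hit.species
            and other.aligned_length == hit.aligned_length
            and identity(other, hit) > redundancy_cutoff
            for other in kept
        )
        if not redundant:
            kept.append(hit)
    return kept
```

The first hit encountered in each redundant group is the one retained, so the input order (for example, sorted by e-value) decides which representative survives; that ordering is a design choice of this sketch, not something stated in the text.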
Protein structure models were built for all sequences in the homolog list as well as for nonsignificant PSI-BLAST hits (i.e., e-value > 0.001) if these had >50 residues aligned to AF2095. Including nonsignificant hits allowed us to test for the existence of possible structural homologs whose sequences have diverged from AF2095 to an extent too great to be detected with confidence by PSI-BLAST. Ten models were calculated for each PSI-BLAST alignment using comparative modeling, as implemented in the program MODELLER (Sali and Blundell 1993). Subsequently, each model's quality was assessed by ProsaII (Sippl 1993). ProsaII Z-scores describe the difference in free energy of a model being evaluated and the average free energy of an ensemble of models obtained by threading the same sequence through unrelated folds. For each set of 10 models, only the model with the most significant (lowest) ProsaII Z-score was retained. The probability that a model is reliable, given its Z-score value and length, was calculated by the pG server (Sanchez and Sali 1998b; http://sanchezlab.org/servers/pg). pG values provide a way to compare statistical significance among models of different lengths. Based on past analyses (Sanchez and Sali 1998b), models with pG scores between 0.7 and 1 are considered reliable and, thereby, likely to have the same fold as AF2095.
Models generated as described above that were assessed as unreliable (i.e., pG < 0.7) were examined manually. Generally, it was possible to derive significantly improved models by manually editing the sequence alignment between homolog and template or by using alternative templates, i.e., 1rlk or 1q7s, that are structurally homologous to AF2095.
Comparison of AF2095 with homologs of known structure
The AF2095 structure (PDB ID 1rzw) and the structures of its homologs, hypothetical protein TA0108 from T. acidophilum (PDB ID 1rlk) and human Pth2, also called Bcl-2 inhibitor of transcription 1 (Bit1) (PDB ID 1q7s; de Pereda et al. 2004), were superimposed by MODELLER and PRISM (Yang and Honig 1999). Two residues were considered structurally equivalent if their Cα atoms were <3.5 Å apart upon rigid-body superposition. The sets of homologs for 1rlk and 1q7s were obtained as described for AF2095 above.
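The 3.5 Å equivalence criterion can be illustrated with a short Python sketch. It assumes that the two sets of Cα coordinates have already been superimposed (the superposition itself, done here by MODELLER and PRISM, is not reproduced), and the nearest-neighbor pairing rule is an assumption made for illustration rather than the procedure used by those programs.

```python
import math


def equivalent_residues(ca_a, ca_b, cutoff: float = 3.5):
    """Pair each residue of A with its nearest residue of B if closer than cutoff.

    ca_a, ca_b: lists of (x, y, z) C-alpha coordinates in angstroms, assumed to
    be in the same frame after rigid-body superposition.
    Returns a list of (index_in_a, index_in_b) pairs.
    """
    pairs = []
    if not ca_b:
        return pairs
    for i, a in enumerate(ca_a):
        # Nearest C-alpha in structure B (illustrative pairing rule).
        j, dist = min(
            ((j, math.dist(a, b)) for j, b in enumerate(ca_b)),
            key=lambda item: item[1],
        )
        if dist < cutoff:
            pairs.append((i, j))
    return pairs
```

The number of returned pairs gives a simple count of structurally equivalent positions between two superimposed structures.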
Pth2 family multiple sequence alignment
The multiple sequence alignment for AF2095 and its homologs incorporated those sequence fragments aligned to AF2095 by PSI-BLAST. The alignment was performed with ClustalW (Thompson et al. 1994) and refined manually by matching secondary structure information for AF2095 and its homologs of known structure (TA0108 and human Pth2) to secondary structure predictions (Kelley et al. 2000) for the homologs of unknown structure. Similarly, ClustalW was used to generate the phylogenetic tree based on sequence alignment to AF2095.
Electrostatic calculations
Electrostatic potentials of structures and models were calculated and visualized in GRASP (Nicholls et al. 1991). In all calculations, the monovalent salt concentration in the continuum solvent was set to 0.1 M. The surface-mapped potential is graded from −5 kT/e (red) to +5 kT/e (blue).
| References |
|---|
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. 1997. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 25: 3389–3402.
Brooks, B.R., Bruccoleri, R.E., Olafson, B.D., States, D.J., Swaminathan, S., and Karplus, M. 1983. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem. 4: 187–217.
Brown, J.R. and Doolittle, W.F. 1997. Archaea and the prokaryote-to-eukaryote transition. Microbiol. Mol. Biol. Rev. 61: 456–502.
Brun, G., Paulin, D., Yot, P., and Chapeville, F. 1971. Peptidyl-tRNA hydrolase: Characterization in some organisms. Enzymic activity in the presence of ribosomes. Biochimie 53: 225–231.
Burkhard, P., Tai, C.-H., Ristroph, C.M., Cook, P.F., and Jansonius, J.N. 1999. Ligand binding induces a large conformational change in O-acetylserine sulfhydrylase from Salmonella typhimurium. J. Mol. Biol. 291: 941–953.
Clore, G.M., Nilges, M., Sukumaran, D.K., Brünger, A.T., Karplus, M., and Gronenborn, A.M. 1986. The three-dimensional structure of α1-purothionin in solution: Combined use of nuclear magnetic resonance, distance geometry and restrained molecular dynamics. EMBO J. 5: 2729–2735.
Clore, G.M., Appella, E., Yamada, M., Matsushima, K., and Gronenborn, A.M. 1990. Three-dimensional structure of interleukin 8 in solution. Biochemistry 29: 1689–1696.
Cornilescu, G., Delaglio, F., and Bax, A. 1999. Protein backbone angle restraints from searching a database for chemical shift and sequence homology. J. Biomol. NMR 13: 289–302.
Crepin, T., Schmitt, E., Mechulam, Y., Sampson, P.B., Vaughan, M.D., Honek, J.F., and Blanquet, S. 2003. Use of analogues of methionine and methionyl adenylate to sample conformational changes during catalysis in Escherichia coli methionyl-tRNA synthetase. J. Mol. Biol. 332: 59–72.
Delaglio, F., Grzesiek, S., Vuister, G.W., Zhu, G., Pfeifer, J., and Bax, A. 1995. NMRPipe: A multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR 6: 277–293.
de Pereda, J.M., Waas, W.F., Jan, Y., Ruoslahti, E., Schimmel, P., and Pascual, J. 2004. Crystal structure of a human peptidyl-tRNA hydrolase reveals a new fold and suggests basis for a bifunctional activity. J. Biol. Chem. 279: 8111–8115.
Dutka, S., Meinnel, T., Lazennec, C., Mechulam, Y., and Blanquet, S. 1993. Role of the 1–72 base pair in tRNAs for the activity of Escherichia coli peptidyl-tRNA hydrolase. Nucleic Acids Res. 21: 4025–4030.
Field, J., Rosenthal, B., and Samuelson, J. 2000. Early lateral transfer of genes encoding malic enzyme, acetyl-CoA synthetase and alcohol dehydrogenases from anaerobic prokaryotes to Entamoeba histolytica. Mol. Microbiol. 38: 446–455.
Francklyn, C., Perona, J.J., Puetz, J., and Hou, Y.-M. 2002. Aminoacyl-tRNA synthetases: Versatile players in the changing theater of translation. RNA 8: 1363–1372.
Fromant, M., Ferri-Fioni, M.-L., Plateau, P., and Blanquet, S. 2003. Peptidyl-tRNA hydrolase from Sulfolobus solfataricus. Nucleic Acids Res. 31: 3227–3235.
Garrett, D.S., Powers, R., Gronenborn, A.M., and Clore, G.M. 1991. A common sense approach to peak picking in two-, three-, and four-dimensional spectra using automatic computer analysis of contour diagrams. J. Magn. Reson. 95: 214–220.
Garrett, D.S., Kuszewski, J., Hancock, T.J., Lodi, P.J., Vuister, G.W., Gronenborn, A.M., and Clore, G.M. 1994. The impact of direct refinement against three-bond HN-CαH coupling constants on protein structure determination by NMR. J. Magn. Reson. B 104: 99–103.
Glaser, F., Pupko, T., Paz, I., Bell, R.E., Bechor-Shental, D., Martz, E., and Ben-Tal, N. 2003. ConSurf: Identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics 19: 163–164.
Goodall, J.J., Chen, G.J., and Page, M.G.P. 2004. Essential role of histidine 20 in the catalytic mechanism of Escherichia coli peptidyl-tRNA hydrolase. Biochemistry 43: 4583–4591.
Guntert, P., Mumenthaler, C., and Wüthrich, K. 1997. Torsion angle dynamics for NMR structure calculation with the new program DYANA. J. Mol. Biol. 273: 283–298.
Hao, W. and Golding, G.B. 2004. Patterns of bacterial gene movement. Mol. Biol. Evol. 21: 1294–1307.
Heurgue-Hamard, V., Mora, L., Guarneros, G., and Buckingham, R.H. 1996. The growth defect in Escherichia coli deficient in peptidyl-tRNA hydrolase is due to starvation for Lys-tRNALys. EMBO J. 15: 2826–2833.
Huang, Y.J. 2001. "Automated determination of protein structures from NMR data by iterative analysis of self-consistent contact patterns." Ph.D. thesis, Rutgers University, New Brunswick, NJ.
Huang, Y.J., Swapna, G.V.T., Rajan, P.K., Ke, H., Xia, B., Shukla, K., Inouye, M., and Montelione, G.T. 2003. Solution NMR structure of ribosome-binding factor A (RbfA), a cold-shock adaptation protein from Escherichia coli. J. Mol. Biol. 327: 521–536.
Huang, Y.J., Moseley, H.N., Baran, M.C., Arrowsmith, C., Powers, R., Tejero, R., Szyperski, T., and Montelione, G.T. 2005. An integrated platform for automated analysis of protein NMR structures. Methods Enzymol. 394: 111–141.
Hynson, R.M.G., Kelly, S.M., Price, N.C., and Ramsay, R.R. 2004. Conformational changes in monoamine oxidase A in response to ligand binding or reduction. Biochim. Biophys. Acta 1672: 60–66.
Ikura, M., Kay, L.E., Tschudin, R., and Bax, A. 1990. Three-dimensional NOESY-HMQC spectroscopy of a carbon-13-labeled protein. J. Magn. Reson. 86: 204–209.
Jan, Y., Matter, M., Pai, J.-t., Chen, Y.-L., Pilch, J., Komatsu, M., Ong, E., Fukuda, M., and Ruoslahti, E. 2004. A mitochondrial protein, Bit1, mediates apoptosis regulated by integrins and Groucho/TLE corepressors. Cell 116: 751–762.
Kelley, L.A., MacCallum, R.M., and Sternberg, M.J.E. 2000. Enhanced genome annotation using structural profiles in the program 3D-PSSM. J. Mol. Biol. 299: 499–520.
Koessel, H. 1970. Purification and properties of peptidyl-tRNA hydrolase from Escherichia coli. Biochim. Biophys. Acta 204: 191–202.
Kraulis, P.J. 1991. MOLSCRIPT: A program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24: 945–949.
Kuszewski, J. and Clore, G.M. 2000. Sources of and solutions to problems in the refinement of protein NMR structures against torsion angle potentials of mean force. J. Magn. Reson. 146: 249–254.
Kuszewski, J., Qin, J., Gronenborn, A.M., and Clore, G.M. 1995. The impact of direct refinement against 13Cα and 13Cβ chemical shifts on protein structure determination by NMR. J. Magn. Reson. B 106: 92–96.
Kuszewski, J., Gronenborn, A.M., and Clore, G.M. 1996. Improving the quality of NMR and crystallographic protein structures by means of a conformational database potential derived from structure databases. Protein Sci. 5: 1067–1080.
Kuszewski, J., Gronenborn, A.M., and Clore, G.M. 1997. Improvements and extensions in the conformational database potential for the refinement of NMR and x-ray structures of proteins and nucleic acids. J. Magn. Reson. 125: 171–177.
Kuszewski, J., Gronenborn, A.M., and Clore, G.M. 1999. Improving the packing and accuracy of NMR structures with a pseudopotential for the radius of gyration. J. Am. Chem. Soc. 121: 2337–2338.
Laskowski, R.A., Rullmann, J.A., MacArthur, M.W., Kaptein, R., and Thornton, J.M. 1996. AQUA and PROCHECK-NMR: Programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR 8: 477–486.
Liu, J., Hegyi, H., Acton, T.B., Montelione, G.T., and Rost, B. 2004. Automatic target selection for structural genomics on eukaryotes. Proteins 56: 188–200.
Lovell, S.C., Davis, I.W., Arendall III, W.B., de Bakker, P.I.W., Word, J.M., Prisant, M.G., Richardson, J.S., and Richardson, D.C. 2003. Structure validation by Cα geometry: φ, ψ and Cβ deviation. Proteins 50: 437–450.
Marion, D., Driscoll, P.C., Kay, L.E., Wingfield, P.T., Bax, A., Gronenborn, A.M., and Clore, G.M. 1989. Overcoming the overlap problem in the assignment of proton NMR spectra of larger proteins by use of three-dimensional heteronuclear proton-nitrogen-15 Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: Application to interleukin 1β. Biochemistry 28: 6150–6156.
Meinnel, T., Mechulam, Y., and Blanquet, S. 1993. Methionine as translation start signal: A review of the enzymes of the pathway in Escherichia coli. Biochimie 75: 1061–1075.
Menez, J., Buckingham, R.H., de Zamaroczy, M., and Campelli, C.K. 2002. Peptidyl-tRNA hydrolase in Bacillus subtilis, encoded by spoVC, is essential to vegetative growth, whereas the homologous enzyme in Saccharomyces cerevisiae is dispensable. Mol. Microbiol. 45: 123–129.
Menninger, J.R. 1979. Studies on the metabolic role of peptidyl-tRNA hydrolase, 5: Accumulation of peptidyl tRNA is lethal to Escherichia coli. J. Bacteriol. 137: 694–696.
Merritt, E.A. and Bacon, D.J. 1997. Raster3D: Photorealistic molecular graphics. Methods Enzymol. 277: 505–524.
Moy, F.J., Pisano, M.R., Chanda, P.K., Urbano, C., Killar, L.M., Sung, M.-L., and Powers, R. 1997. Assignments, secondary structure and dynamics of the inhibitor-free catalytic fragment of human fibroblast collagenase. J. Biomol. NMR 10: 9–19.
Moy, F.J., Chanda, P.K., Cosmi, S., Pisano, M.R., Urbano, C., Wilhelm, J., and Powers, R. 1998. High-resolution solution structure of the inhibitor-free catalytic fragment of human fibroblast collagenase determined by multidimensional NMR. Biochemistry 37: 1495–1504.
Nicholls, A., Sharp, K., and Honig, B. 1991. Protein folding and association: Insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins 11: 281–296.
Nilges, M., Clore, G.M., and Gronenborn, A.M. 1988a. Determination of three-dimensional structures of proteins from i