1 Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby 2800, Denmark
2 Stockholm Bioinformatics Center, Department of Biochemistry, Stockholm University, Stockholm S-106 91, Sweden
Reprint requests to: Anders Krogh, The Bioinformatics Centre, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen, Denmark; e-mail: krogh{at}binf.ku.dk; fax: 45-3532-1300.
(RECEIVED January 24, 2003; FINAL REVISION May 15, 2003; ACCEPTED May 19, 2003)
3 These authors contributed equally to the presented work.
4 Present address: The Bioinformatics Centre, University of Copenhagen, Universitetsparken 15, 2100 Copenhagen, Denmark
Article and publication are at http://www.proteinscience.org/cgi/doi/10.1110/ps.0303703.
| Abstract |
|---|
Keywords: Signal peptides; lipoprotein prediction; HMM; neural networks
| Introduction |
|---|
Biosynthesis of lipoproteins in Gram-negative and Gram-positive bacteria consists of three steps, as shown in Figure 1: transfer of a diacylglyceride to the cysteine sulphydryl group of the unmodified prolipoprotein; cleavage of the signal peptide by signal peptidase II (SPaseII), forming an apolipoprotein; and, finally, acylation of the α-amino group of the N-terminal cysteine of the apolipoprotein (Sankaran and Wu 1994). Before the processing of the prolipoprotein, which takes place on the periplasmic side of the inner membrane, the prolipoprotein is exported through the inner membrane by the general secretory pathway that is also used by secretory proteins processed by signal peptidase I (SPaseI) (Hayashi and Wu 1990). In Gram-negative bacteria, the lipoproteins are anchored to either the inner or the outer membrane, and a single amino acid in position +2 is proposed to determine the final destination of the lipoproteins (Yamaguchi et al. 1988; Seydel et al. 1999). For more details about biosynthesis and export of lipoproteins, see Braun and Wu (1994).
C (von Heijne 1989). The consensus for the lipoprotein signal sequence has previously been characterized further, so that it could be used for lipoprotein prediction. One example is the consensus made by von Heijne, (LVI)(ASTG)(GA)C, requiring only one match to the first two positions. This pattern was able to discriminate between all lipoprotein signal peptides and SPaseI-cleaved signal peptides known at the time (von Heijne 1989). The lipoprotein predictor in PSORT (Nakai and Kanehisa 1991) integrates the von Heijne consensus sequence in its predictions. Another example is the Prosite pattern PS00013, {DERK}(6)(LIVMFWSTAG)(2)(LIVMFYSTAGCQ)(AGS)C, where {DERK}(6) means that none of the four amino acids are allowed in the first six positions (position -10 to -5 relative to the cleavage site). The pattern has two additional rules: The cysteine must be between position 15 and 35, and at least one lysine or arginine must be in one of the first seven positions of the signal peptide (Falquet et al. 2002). More recently, a new regular expression was made for Gram-positive bacteria (Sutcliffe and Harrington 2002).
The lipoprotein signal peptide has been compared with the SPaseI-cleaved signal peptides. The lipoprotein signal peptides have a similar n-region, but the h-regions of lipoprotein signal peptides are shorter and the SPaseI-cleaved signal peptides have a polar c-region before the cleavage site (Klein et al. 1988; von Heijne 1989). For lipoproteins, as well as for the SPaseI-cleaved proteins, the n- and h-regions are required for the translocation of the uncleaved protein precursor through the inner membrane. The c-region is necessary for the recognition of the cleavage site by the signal peptidase (von Heijne 1990).
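The Prosite pattern PS00013 and its two additional rules, as described above, can be sketched as a regular expression scan. This is a minimal illustration, not the PROSITE implementation: the function name, the use of a lookahead to report overlapping candidates, and the example sequence (an illustrative lipoprotein-like N-terminus) are assumptions made here.

```python
import re

# Core of PS00013 as quoted in the text: six residues that are not D, E, R or K,
# two of LIVMFWSTAG, one of LIVMFYSTAGCQ, one of AGS, then the cysteine.
# The lookahead lets overlapping candidate sites all be reported.
PS00013_CORE = re.compile(
    r"(?=([^DERK]{6}[LIVMFWSTAG]{2}[LIVMFYSTAGCQ][AGS]C))"
)

def prosite_lipobox_cysteines(seq, min_pos=15, max_pos=35):
    """Return 1-based positions of cysteines that satisfy the pattern and both rules.

    Hypothetical helper: min_pos/max_pos encode the rule that the cysteine
    must lie between positions 15 and 35.
    """
    seq = seq.upper()
    # Rule: at least one K or R among the first seven residues.
    if not any(aa in "KR" for aa in seq[:7]):
        return []
    hits = []
    for match in PS00013_CORE.finditer(seq):
        # The matched core is 11 residues long and the cysteine is its last residue.
        cys_pos = match.start(1) + 11
        if min_pos <= cys_pos <= max_pos:
            hits.append(cys_pos)
    return hits

example = "MKATKLVLGAVILGSTLLAGCSSNAKIDQ"  # illustrative sequence only
print(prosite_lipobox_cysteines(example))
```

Note that the character class `[^DERK]` also accepts non-standard symbols such as X; a stricter version would list the 16 allowed residues explicitly.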
Methods for prediction of SPaseI-cleaved signal peptides have been around for some time (Nakai and Kanehisa 1991; Nielsen et al. 1997). The performance of these methods is generally quite good, but it is a problem to discriminate SPaseI-cleaved signal peptides from SPaseII-cleaved signals and N-terminal transmembrane helices (Nielsen et al. 1997; Nielsen and Krogh 1998). Similarly, methods for predicting transmembrane helices often, by mistake, predict signal peptides as membrane helices (for example, see Krogh et al. 2001).
Here we present a method to predict lipoproteins in Gram-negative bacteria and their signal peptide cleavage site based on a hidden Markov model (HMM) or a neural network. Both methods are significantly better than the above-mentioned existing methods. The HMM is trained on SPaseI-cleaved proteins, lipoproteins, cytoplasmic proteins, and transmembrane proteins, and it is able to classify an N-terminal protein sequence as a lipoprotein signal peptide, a SPaseI-cleaved signal peptide, or a protein without a signal sequence (cytoplasmic or transmembrane) with very low error rates. The HMM is also able to predict the cleavage site in both SPaseI- and SPaseII-cleaved signal peptides.
| Results |
|---|
Analysis of signal peptides
The length distributions of the two kinds of signal peptides are shown in Figure 2. The mean length of the lipoprotein signal peptides is found to be 19.3 amino acids, and for the SPaseI-cleaved signal peptides, it is 24.9. Sequence logos (Schneider and Stephens 1990) for the regions close to the cleavage sites (Fig. 3A,B) show that the cleavage site consensus differs in amino acid distribution, which corresponds well with the fact that the signal peptides are cleaved by different proteolytic enzymes. The lipoproteins must have a cysteine after the cleavage site, whereas the SPaseI-cleaved signal peptides can have several different amino acids in the first position after the cleavage site. The signal parts to the left of the cleavage site differ as well. Figure 3, A and B, indicates that the hydrophobic region is closer to the cleavage site for the lipoproteins than for the SPaseI-cleaved signal peptides. The SPaseI-cleaved signal peptides have a polar region right before the cleavage site (mostly serine). Figure 3, C and D, shows the sequence logos of the first 30 amino acids for SPaseI- and SPaseII-cleaved proteins. SPaseI and SPaseII signal peptides both have some positively charged amino acids at the beginning of the sequence, followed by a hydrophobic region with a similar amino acid distribution, and the similarity between the sequences corresponds well with the fact that all signal peptides are recognized by the same secretory enzymes. Figure 3, C and B, also shows that the hydrophobic region of the lipoproteins is shorter than the one for secretory proteins, as expected. Cytoplasmic proteins do not have a preference for a particular amino acid in any position besides the first methionine.
The neural networks were evaluated by their performance on the test data sets by cross-validation, and the correlation coefficient was calculated (Matthews 1975). Based on the correlation coefficients and the number of lipoproteins predicted, the best network was chosen and the optimal parameters were estimated.
Judging from the correlation coefficients, the best neural network prediction (Fig. 4) was obtained for neural networks with a window size of 29 and two hidden neurons, which were chosen as the optimal parameters. For this neural network, there were 61 true positives (96.8% of all positives) and eight false positives (1.1% of all negatives). Some of the other neural networks with high correlation coefficients had fewer false positives but also fewer true positives. The network with these optimized parameters was used in all the following analyses. With this neural network, none of the transmembrane proteins were predicted as lipoproteins.
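For readers unfamiliar with the correlation coefficient of Matthews (1975), the following sketch shows how it is computed from a confusion table. The counts of true and false positives are taken from the text; the number of true negatives (708) is an assumption derived here from the 328 SPaseI-cleaved and 388 cytoplasmic proteins in the data set minus the eight false positives, and the function name is hypothetical.

```python
import math

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion table."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: return 0 when any marginal sum is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom

# Counts for the selected network (window size 29, two hidden neurons).
tp, fn = 61, 63 - 61          # 63 lipoproteins in the data set
fp = 8
tn = (328 + 388) - fp         # assumed: all other non-lipoproteins are true negatives

print(f"sensitivity = {tp / (tp + fn):.3f}")
print(f"false-positive rate = {fp / (fp + tn):.3f}")
print(f"MCC = {matthews_cc(tp, tn, fp, fn):.3f}")
```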
SPaseI-cleaved signal peptides
The submodel for signal peptides is shown in Figure 6. It has states modeling the n-region, the h-region, and the c-region. It also models part of the protein after the cleavage site. The signal peptide model is very similar to the one in Nielsen and Krogh (1998), but is simplified a little. Initially, the same model as in Nielsen and Krogh (1998) was used, but after estimation, states and transitions with very small probabilities were eliminated.
N-terminal transmembrane helices
N-terminal transmembrane helices are often mistaken for signal peptides, and vice versa (Krogh et al. 2001). Therefore, a submodel for N-terminal transmembrane helices was included. It is essentially a part of the TMHMM model (Krogh et al. 2001), in which just one membrane helix can be modeled. The intention with this part of the model is primarily to limit the number of false positives from the signal peptide predictions, not to predict whether a protein has an N-terminal transmembrane helix.
Cytoplasmic proteins
This submodel consists of two states: a state for the first amino acid and a state for the rest with a transition to itself.
Only the first 70 amino acids were used for both training and testing. The first three branches have a six-state submodel for modeling the length distribution and amino acid composition of the last part of the sequence in the mature protein. The whole model was estimated from the data and tested with cross-validation in the same way as the neural network (see Methods).
To classify proteins into each of the four classes represented by the submodels, the posterior probabilities of the sequence given each branch were used. These probabilities were divided by the probability of the protein according to a null model. The logarithm of the ratio (the log-odds) was used as the score for each class. We used the submodel for cytoplasmic proteins as the null model, so the score for cytoplasmic is always the same for any protein. To predict the class of a protein, we chose the highest scoring branch. Table 1 shows the number of predictions in each class in the cross-validation versus the correct classification (a "confusion matrix").
| Discussion |
|---|
The von Heijne consensus pattern predicted 54 of the 63 lipoproteins correctly, but also gave a total of 74 false positives. Forty-four of these were SPaseI-cleaved signal peptides, which the consensus pattern should be able to distinguish from lipoprotein signal peptides. This is not surprising, because many lipoproteins and SPaseI-cleaved proteins have been annotated since 1989. The Prosite pattern predicted 56 of the 63 lipoproteins correctly and came up with only 14 false positives, significantly better than the von Heijne consensus pattern. Still, the HMM and the neural network were both significantly better, as the number of false positives was almost twice as high for the Prosite pattern as for the neural network, and the difference was even larger when compared with the HMM.
When comparing the two new predictors, the HMM seemed superior. By using the HMM, the same number of lipoproteins were predicted correctly as with the neural network. However, fewer false positives were predicted with the HMM. By varying the threshold for the neural network, the rate of false positives could be decreased, but at the cost of true positives. In fact, when the threshold for the output neuron was raised so that the number of false positives decreased to the same number as for the HMM, the number of true positives dropped to as few as 51. The worse performance of the neural network could well be due to a much larger number of free parameters compared with the relatively small number of lipoproteins used for training. It is possible that the neural net could have been further improved by, for example, using an asymmetric input window as in SignalP.
Table 2 summarizes the comparison. Because the HMM gave the best results, it was used for further investigations.
The amino acid in position +2 relative to the cleavage site is believed to determine whether the protein is attached to the inner or outer membrane of Gram-negative bacteria. Traditionally, it was thought that an aspartic acid in this position directs the protein to the inner membrane, and all other amino acids direct it to the outer membrane. However, it has been shown that the situation is not quite so simple (Seydel et al. 1999). We have not been able to find sufficient experimental data to include this sorting signal into the model, and instead the server provided at www.cbs.dtu.dk/services/LipoP/ simply reports which amino acid is in the +2 position to help users judge for themselves.
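Reporting the +2 residue, as the server is described as doing, amounts to a simple lookup once the lipid-modified cysteine is known. The sketch below is illustrative only: the function names are hypothetical, the example sequence is an illustrative lipoprotein-like N-terminus, and the second helper encodes only the traditional rule that, as noted above, is known to be an oversimplification.

```python
def plus2_residue(seq, cys_pos):
    """Return the residue at position +2, given the 1-based position of the
    lipid-modified cysteine (position +1). Returns None if the sequence ends there."""
    if seq[cys_pos - 1] != "C":
        raise ValueError("position +1 must be a cysteine")
    if cys_pos >= len(seq):
        return None
    return seq[cys_pos]  # 0-based index cys_pos is 1-based position cys_pos + 1

def traditional_plus2_rule(residue):
    """Traditional rule only (Asp -> inner membrane); real sorting is more complex."""
    return "inner membrane" if residue == "D" else "outer membrane"

example = "MKATKLVLGAVILGSTLLAGCSSNAKIDQ"  # illustrative sequence only
res = plus2_residue(example, 21)
print(res, "->", traditional_plus2_rule(res))
```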
Prediction of Gram-positive lipoproteins
Because lipoproteins from Gram-negative and Gram-positive Eubacteria resemble each other in the consensus sequence close to the lipid attachment site (Sutcliffe and Russell 1995), we also tested the HMM on Gram-positive lipoproteins.
A small data set consisting of 28 lipoproteins from Gram-positive Eubacteria was extracted from SWISS-PROT by using the same criteria as for the Gram-negative lipoproteins. Twenty-six of these lipoproteins were correctly predicted by the HMM to be lipoproteins, whereas the last two proteins were predicted as transmembrane proteins. Because of the limited number of available sequences, the data set was not similarity reduced, and the two sequences not predicted to be lipoproteins by the HMM were actually homologous cytochrome c oxidase polypeptide II precursors from two different Bacillus species, B. firmus and B. subtilis (COX2_BACFI and QOX2_BACSU). As these are annotated in SWISS-PROT as having several potential transmembrane helices, as well as having a lipoprotein signal peptide, the HMM prediction is actually not wrong. The SPaseII scores were still relatively high (2.92 and 5.70) compared with the TMH scores (6.95 and 9.22). It should be noted that several examples of integral membrane proteins with a cleavable lipoprotein signal peptide have been shown to exist (Pyrowolakis et al. 1998; Bengtsson et al. 1999; Sakamoto et al. 2001).
After the completion of the above data set, it came to our attention that another set of 33 experimentally verified Gram-positive lipoproteins was used by Sutcliffe and Harrington (2002). Twelve of these sequences are also in our data set. We have tested our method on 31 of the sequences (LppC and MBL from Streptococcus equi were not found). Four sequences were wrongly classified, but three of them had the correct cleavage site predicted in a suboptimal prediction of lipoprotein (see www.binf.ku.dk/krogh/LipoP/). Two cytochrome c oxidases (QOX2_BACSU and Q93HZ4) were predicted as transmembrane, one protein was predicted as an ordinary signal peptide (SODC_MYCTU), and one as cytosolic (KAPB_BACSU).
Genome search
Lipoproteins were predicted in the complete proteomes of 13 microbial genomes from GenBank. Because the above results indicated that the HMM was also capable of predicting Gram-positive lipoproteins, the genome of the industrially very important B. subtilis was included for testing. Table 3 lists the number of proteins predicted as lipoproteins by the HMM. The number of predicted lipoproteins annotated as such is listed for both GenBank and SWISS-PROT. Many of the proteins included in the whole genome data sets from GenBank cannot be found in SWISS-PROT. Therefore, the number of predicted lipoproteins that can be found in SWISS-PROT is included for comparison.
In the Gram-positive B. subtilis, 101 annotated proteins were predicted as lipoproteins. For comparison, Tjalsma et al. (1999) found 114 probable lipoproteins by a SignalP search combined with a lipobox search and a Blast similarity search. Sutcliffe and Harrington (2002) found 67 lipoproteins (61 probable and six proven) by a regular expression called G+LPP, and Gonnet and Lisacek (2002) found 65 lipoproteins predicted by another refined regular expression.
Conclusion
A method for lipoprotein prediction, LipoP, was developed. Both an HMM and a neural network were significantly better at predicting lipoproteins than were any of the existing methods discussed in this article. The HMM method was chosen for the remainder of the analysis, mainly because it distinguishes between lipoproteins, SPaseI-cleaved signal peptides, cytoplasmic proteins, and proteins with N-terminal transmembrane helices. However, when handling proteins that are lipoproteins and also have transmembrane regions, the HMM, in some cases, misses the lipoprotein signal peptide.
The method was used to predict lipoproteins in 12 Gram-negative bacteria. When comparing a genome search of E. coli with new experimental data, most of the experimentally verified lipoproteins were correctly predicted as lipoproteins (94.6%). This verification of the lipoproteins predicted in E. coli might be an indication of how well the HMM performs on genome data in general. Even though the HMM is trained on proteins from Gram-negative bacteria, it also seems to be able to predict Gram-positive lipoproteins. This feature was used to make a genome search of the Gram-positive bacterium B. subtilis.
The LipoP server is accessible at www.cbs.dtu.dk/services/LipoP/. Genome predictions and other material are accessible at www.binf.ku.dk/krogh/LipoP/.
| Materials and methods |
|---|
Only a very limited number of lipoproteins with known signal length and lipid attachment site for Gram-negative bacteria could be retrieved. Therefore, lipoproteins annotated as probable for signal length and lipid attachment site were also allowed in the data set, as were lipoproteins annotated as potential in only one of these categories but certain in the other. In this way, we were able to extract 99 lipoproteins. More sequences were available for Gram-negative SPaseI-cleaved proteins and for Gram-negative cytoplasmic proteins; thus all proteins with annotations such as probable and potential were excluded from these data sets, creating two parts of the data set consisting of 528 SPaseI-cleaved proteins and 1026 cytoplasmic proteins, respectively. In these sets, the first amino acid after the cleavage site was labeled.
The combined data set was then homology reduced to limit bias so that it could be used for testing with cross-validation. Because we were primarily interested in the signal part of the sequence, only the first 30 amino acids were taken into consideration for the lipoproteins and the first 60 amino acids for the SPaseI-cleaved proteins and the cytoplasmic proteins in the similarity reduction. To generate a nonredundant data set, we searched each sequence in the data set against all the other sequences by using BLASTP (Altschul et al. 1997) and a Blosum62 score matrix (Henikoff and Henikoff 1992). By using a threshold of 10⁻⁶ on the expectation score, we subsequently generated a maximal nonredundant version of the data set using the Hobohm-2 algorithm (Hobohm et al. 1992). Finally, the data set consisted of 63 nonhomologous lipoproteins (Table 5), 328 SPaseI-cleaved proteins, and 388 cytoplasmic proteins.
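The redundancy reduction step can be illustrated with a small sketch of the Hobohm-2 idea: build a neighbour list from all pairwise hits below the threshold, then repeatedly remove the sequence with the most remaining neighbours until no neighbours are left. The input format (a dictionary of BLASTP hits that already passed the E-value cut-off), the function name, and the tie-breaking rule are assumptions made for this example, not details taken from the article.

```python
def hobohm2(neighbours):
    """Greedy Hobohm-2 style reduction.

    neighbours: dict mapping a sequence id to the set of ids it hits below
    the similarity threshold (hypothetical pre-computed BLASTP results).
    Returns the sorted ids of the retained, mutually non-similar sequences.
    """
    # Copy the input and drop self-hits.
    graph = {k: set(v) - {k} for k, v in neighbours.items()}
    # Make the relation symmetric, since a hit may be reported in one direction only.
    for k, hits in list(graph.items()):
        for h in hits:
            graph.setdefault(h, set()).add(k)
    while graph:
        # Pick the sequence with the most remaining neighbours (ties broken by id).
        worst = max(graph, key=lambda k: (len(graph[k]), k))
        if not graph[worst]:
            break  # nobody has neighbours left
        for h in graph.pop(worst):
            graph[h].discard(worst)
    return sorted(graph)

hits = {"A": {"B", "C"}, "B": {"A"}, "C": set(), "D": set()}
print(hobohm2(hits))
```

Removing the most connected sequence first tends to keep more sequences than picking representatives one at a time, which is why this variant is used when a maximal nonredundant set is wanted.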
For testing the methods, the data were divided into 63 sets. Each set contained exactly one lipoprotein. The other sequences were distributed equally and randomly among the 63 sets. In the cross-validation procedure, the HMM or neural network was trained on 62 of these sets and tested on the one that was left out. This was repeated 63 times, so that all sets were used for testing once, and finally, the test results were averaged. This is a standard method for obtaining unbiased test results when the amount of data is limited.
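The partitioning described above can be sketched as follows. This is an illustration under stated assumptions: the function name, the round-robin assignment after shuffling, and the fixed random seed are choices made here, and the sequence identifiers are placeholders.

```python
import random

def make_folds(lipoproteins, others, seed=0):
    """Build one cross-validation set per lipoprotein and spread the
    remaining sequences randomly and as evenly as possible over the sets."""
    rng = random.Random(seed)
    folds = [[lp] for lp in lipoproteins]   # exactly one lipoprotein per set
    shuffled = list(others)
    rng.shuffle(shuffled)
    for i, seq_id in enumerate(shuffled):
        folds[i % len(folds)].append(seq_id)  # round-robin keeps sizes within one of each other
    return folds

lipo = [f"lipo{i}" for i in range(63)]      # 63 lipoproteins
rest = [f"other{i}" for i in range(716)]    # 328 SPaseI-cleaved + 388 cytoplasmic
folds = make_folds(lipo, rest)

# Leave-one-set-out loop: train on 62 sets, test on the remaining one.
for test_index, test_set in enumerate(folds):
    train_set = [s for j, f in enumerate(folds) if j != test_index for s in f]
    assert len(train_set) + len(test_set) == 779
```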
A data set consisting of Gram-positive lipoproteins was made for testing purposes only. The proteins were extracted in the same way as for the Gram-negative lipoprotein data set, but the data set was not homology-reduced because the Gram-positive lipoproteins would only be used for testing.
Recently, 90 lipoproteins from E. coli have been experimentally verified by S. Matsuyama et al. (unpubl.). These were used as a base for comparison of the results from the genome search carried out for E. coli.
Neural networks
The neural network training was carried out by using the lipoprotein cleavage site on lipoproteins as positive examples and all remaining "C"s from lipoproteins, SPaseI-cleaved proteins, and cytoplasmic proteins as negative examples. Thus, the neural networks were trained only on cysteines, and backpropagation was used during training. The number of hidden neurons was varied from zero to four, and the size of the symmetric windows was varied from 27 to 33. The neural networks were evaluated by their performance on the test data sets. The test data from the 63 cross-validations were added together, and the correlation coefficient was calculated (Matthews 1975). In this way, all proteins in the entire data set were included in the calculation, and none of them were tested on the network they were trained on. By considering the correlation coefficient and the number of lipoproteins predicted, the best network was chosen and the optimal parameters were estimated.
The training set for the neural networks consisted of the Gram-negative lipoprotein, SPaseI, and cytoplasmic protein data sets, whereas the transmembrane data set was used only for testing. The neural network was trained on the first 100 amino acids of each sequence. For testing, only the first 50 amino acids of each sequence were considered.
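Because the networks see only cysteines, each input example is a symmetric sequence window centred on one cysteine. The following sketch shows how such windows could be extracted; the padding symbol, the function name, and the example sequence are assumptions made here, and the article does not specify how window positions beyond the sequence ends were handled or how residues were encoded.

```python
def cysteine_windows(seq, window=29, limit=50, pad="X"):
    """Return (1-based position, window string) for every cysteine among the
    first `limit` residues, using a symmetric window centred on the cysteine."""
    if window % 2 == 0:
        raise ValueError("a symmetric window needs an odd size")
    half = window // 2
    # Pad both ends so that windows near the termini keep the full length.
    padded = pad * half + seq + pad * half
    examples = []
    for i, aa in enumerate(seq[:limit]):
        if aa == "C":
            examples.append((i + 1, padded[i:i + window]))
    return examples

example = "MKATKLVLGAVILGSTLLAGCSSNAKIDQ"  # illustrative sequence only
for pos, win in cysteine_windows(example):
    print(pos, win)
```

With a window of 29 the cysteine sits at index 14 of each window; the cleavage-site cysteine would be labeled as a positive example and every other cysteine as a negative one.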
HMMs
The four branches or submodels already described were denoted SPaseI, SPaseII, TMH, and CYT. The first state of each branch was given probability 1 out of 20 for all amino acids. This is because the first amino acid is always methionine, so there is no information in this amino acid. The advantage of this scheme is that the model can deal with a wrongly assigned first amino acid (which happens sometimes when the start codon of a gene is not ATG).
The probability for entering each of the branches was not estimated from the data. These entry probabilities reflect the prior probabilities (in a Bayesian statistical sense) that a randomly chosen protein belongs to each of the four classes. They also determine the number of predictions of each class, so by changing them, one can, for instance, increase the number of predictions from a certain class. They were set by trial and error so as to get reasonable prediction levels for the classes, with the main focus on the performance on lipoproteins. Equivalently, one could fix the four entry probabilities to, for example, one out of four and then, instead of choosing the highest scoring branch for prediction, have class-specific cut-offs on the log-odds score. In our final model, we have these probabilities: P(SPaseI) = 0.08, P(SPaseII) = 0.02, P(TMH) = 0.03, and P(CYT) = 0.87. The system is not very sensitive to these parameters.
The model was trained using the Baum-Welch procedure for labeled sequences (Krogh 1997; Durbin et al. 1998; Krogh and Riis 1999). The sequences were labeled according to which of the three classes they belonged to, and the cleavage site was labeled for signal sequences. This ensures that a submodel is trained on the correct set of proteins, and that cleavage sites are correctly positioned during training. Only the first 70 amino acids of each protein were used for training and testing.
We used the submodel for cytoplasmic proteins as the null model, so the score for cytoplasmic is always equal to log[P(CYT)] = -0.1393 (the natural logarithm is used).
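The scoring scheme can be sketched as follows, using the entry probabilities given above. The per-branch log-likelihoods are hypothetical placeholder numbers; in the real method they come from the HMM submodels, which are not reproduced here, and the function names are likewise assumptions of this example.

```python
import math

# Entry (prior) probabilities of the four branches, as stated in the text.
PRIORS = {"SPaseI": 0.08, "SPaseII": 0.02, "TMH": 0.03, "CYT": 0.87}

def log_odds_scores(log_lik):
    """log_lik maps each branch to ln P(sequence | branch submodel).

    The cytoplasmic submodel doubles as the null model, so its score
    reduces to ln P(CYT) regardless of the sequence.
    """
    null = log_lik["CYT"]
    return {c: math.log(PRIORS[c]) + log_lik[c] - null for c in PRIORS}

def classify(log_lik):
    """Pick the highest scoring branch."""
    scores = log_odds_scores(log_lik)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical log-likelihoods for one protein (natural logarithm).
example = {"SPaseI": -200.0, "SPaseII": -190.0, "TMH": -205.0, "CYT": -201.0}
best, scores = classify(example)
print(best, {c: round(s, 4) for c, s in scores.items()})
```

Replacing the priors by a uniform 0.25 and applying class-specific cut-offs to the scores would give equivalent decisions, which is the alternative mentioned in the text.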
Genome search
Data sets were extracted from the GenBank genomic library (Benson et al. 2002). The extracted genomes and the corresponding GenBank files are listed in Table 6. The number of sequences included in each genome file is listed as well.
| Acknowledgments |
|---|
The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC section 1734 solely to indicate this fact.
| References |
|---|
Bairoch, A. and Apweiler, R. 2000. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28: 45-48.
Bengtsson, J., Tjalsma, H., Rivolta, C., and Hederstedt, L. 1999. Subunit II of Bacillus subtilis cytochrome c oxidase is a lipoprotein. J. Bacteriol. 181: 685-688.
Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J., Rapp, B.A., and Wheeler, D.L. 2002. GenBank. Nucleic Acids Res. 30: 17-20.
Braun, V. and Wu, H.C. 1994. Lipoproteins, structure, function, biosynthesis, and a model for protein export. In Bacterial cell wall (eds. J.M. Ghuysen and R. Hakenbeck), pp. 319-341. Elsevier, Amsterdam, The Netherlands.
Durbin, R.M., Eddy, S.R., Krogh, A., and Mitchison, G. 1998. Biological sequence analysis. Cambridge University Press, Cambridge, UK.
Falquet, L., Pagni, M., Bucher, P., Hulo, N., Sigrist, C.J., Hofmann, K., and Bairoch, A. 2002. The PROSITE database: Its status in 2002. Nucleic Acids Res. 30: 235-238.
Gonnet, P. and Lisacek, F. 2002. Probabilistic alignment of motifs with sequences. Bioinformatics 18: 1091-1101.
Hayashi, S. and Wu, H.C. 1990. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22: 451-471.
Henikoff, S. and Henikoff, J.G. 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. 89: 10915-10919.
Hobohm, U., Scharf, M., Schneider, R., and Sander, C. 1992. Selection of representative protein data sets. Protein Sci. 1: 409-417.
Klein, P., Somorjai, R.L., and Lau, P.C. 1988. Distinctive properties of signal sequences from bacterial lipoproteins. Protein Eng. 2: 15-20.
Krogh, A. 1997. Two methods for improving performance of a HMM and their application for gene finding. In Proceedings of the Fifth International Conference on Intelligent Systems for Molecular Biology (eds. T. Gaasterland et al.), pp. 179-186. AAAI Press, Menlo Park, CA.
Krogh, A. and Riis, S.K. 1999. Hidden neural networks. Neural Comput. 11: 541-563.
Krogh, A., Larsson, B., von Heijne, G., and Sonnhammer, E.L.L. 2001. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J. Mol. Biol. 305: 567-580.
Matthews, B.W. 1975. Comparison of the predicted and observed secondary structure of T4 phage lysozyme. Biochim. Biophys. Acta 405: 442-451.
Nakai, K. and Kanehisa, M. 1991. Expert system for predicting protein localization sites in Gram-negative bacteria. Proteins 11: 95-110.
Nielsen, H. and Krogh, A. 1998. Prediction of signal peptides and signal anchors by a hidden Markov model. In Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology (eds. J. Glasgow et al.), pp. 122-130. AAAI Press, Menlo Park, CA.
Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10: 1-6.
Pyrowolakis, G., Hofmann, D., and Herrmann, R. 1998. The subunit b of the F0F1-type ATPase of the bacterium Mycoplasma pneumoniae is a lipoprotein. J. Biol. Chem. 273: 24792-24796.
Sakamoto, J., Shibata, T., Mine, T., Miyahara, R., Torigoe, T., Noguchi, S., Matsushita, K., and Sone, N. 2001. Cytochrome c oxidase contains an extra charged amino acid cluster in a new type of respiratory chain in the amino-acid-producing Gram-positive bacterium Corynebacterium glutamicum. Microbiology 147: 2865-2871.
Sankaran, K. and Wu, H.C. 1994. Lipid modification of bacterial prolipoprotein. Transfer of diacylglyceryl moiety from phosphatidylglycerol. J. Biol. Chem. 269: 19701-19706.
Schneider, T.D. and Stephens, R.M. 1990. Sequence logos: A new way to display consensus sequences. Nucleic Acids Res. 18: 6097-6100.
Seydel, A., Gounon, P., and Pugsley, A.P. 1999. Testing the "+2 rule" for lipoprotein sorting in the Escherichia coli cell envelope with a new genetic selection. Mol. Microbiol. 34: 810-821.
Sutcliffe, I.C. and Harrington, D.J. 2002. Pattern searches for the identification of putative lipoprotein genes in Gram-positive bacterial genomes. Microbiology 148: 2065-2077.
Sutcliffe, I.C. and Russell, R.R. 1995. Lipoproteins of Gram-positive bacteria. J. Bacteriol. 177: 1123-1128.
Tjalsma, H., Kontinen, V.P., Pragai, Z., Wu, H., Meima, R., Venema, G., Bron, S., Sarvas, M., and van Dijl, J.M. 1999. The role of lipoprotein processing by signal peptidase II in the Gram-positive eubacterium Bacillus subtilis: Signal peptidase II is required for the efficient secretion of α-amylase, a non-lipoprotein. J. Biol. Chem. 274: 1698-1707.
von Heijne, G. 1989. The structure of signal peptides from bacterial lipoproteins. Protein Eng. 2: 531-534.
von Heijne, G. 1990. The signal peptide. J. Membr. Biol. 115: 195-201.
Yamaguchi, K., Yu, F., and Inouye, M. 1988. A single amino acid determinant of the membrane localization of lipoproteins in E. coli. Cell 53: 423-432.