Complete genome sequence of Paenibacillus sp. strain JDR-2
- Virginia Chow1,
- Guang Nong1,
- Franz J. St. John2,
- John D. Rice1,
- Ellen Dickstein3,
- Olga Chertkov4,
- David Bruce4,
- Chris Detter4,
- Thomas Brettin5,
- James Han6,
- Tanja Woyke6,
- Sam Pitluck6,
- Matt Nolan6,
- Amrita Pati6,
- Joel Martin6,
- Alex Copeland6,
- Miriam L. Land5,
- Lynne Goodwin4,
- Jeffrey B. Jones3,
- Lonnie O. Ingram1,
- Keelnathan T. Shanmugam1 and
- James F. Preston1
© The Author(s) 2012
Published: 19 March 2012
Paenibacillus sp. strain JDR-2, an aggressively xylanolytic bacterium isolated from sweetgum (Liquidambar styraciflua) wood, is able to efficiently depolymerize, assimilate and metabolize 4-O-methylglucuronoxylan, the predominant structural component of hardwood hemicelluloses. A basis for this capability was first supported by the identification of genes and characterization of encoded enzymes and has been further defined by the sequencing and annotation of the complete genome, which we describe. In addition to genes implicated in the utilization of β-1,4-xylan, genes have also been identified for the utilization of other hemicellulosic polysaccharides. The genome of Paenibacillus sp. JDR-2 contains 7,184,930 bp in a single replicon with 6,288 protein-coding and 122 RNA genes. Uniquely prominent are 874 genes encoding proteins involved in carbohydrate transport and metabolism. The prevalence and organization of these genes support a metabolic potential for bioprocessing of hemicellulose fractions derived from lignocellulosic resources.
Keywords: aerobic, mesophile, Gram-positive, Paenibacillus, xylanolytic, xylan
Paenibacillus sp. strain JDR-2 (Pjdr2) was isolated from wafers cut from live stems of sweet gum (Liquidambar styraciflua) placed in soil in an area populated predominantly by this tree species. The ability of this isolate to grow on 4-O-methylglucuronoxylose (MeGX) as the sole carbon source identified a metabolic potential not previously described. MeGX is released along with fermentable xylose during dilute acid pretreatment of lignocellulosic biomass. Since MeGX may represent 5 to 20% of the hemicellulose components from hardwoods and agricultural residues, this ability was of interest for increasing bioconversion yields of fermentable sugars from these resources [1,2].
Growth rates and yields of Pjdr2 with polymeric 4-O-methylglucuronoxylan (MeGXn) as substrate were much greater than with monosaccharides and oligosaccharides derived from MeGXn. These increases are presumably the result of a cell-associated multimodular GH10 endoxylanase that generates xylobiose, xylotriose, and the aldouronate 4-O-methylglucuronoxylotriose (MeGX3) for direct assimilation and metabolism. A cluster of genes cloned and sequenced from Pjdr2 genomic DNA contained two genes encoding transcriptional regulators, three genes encoding ABC transporters, and three sequential structural genes lacking secretion sequences encoding a GH67 α-glucuronidase, a GH10 endoxylanase catalytic domain and a putative GH43 β-xylosidase. The expression of these genes, as well as that of a distal gene encoding a secreted cell-associated multimodular GH10 endoxylanase, was coordinately responsive to inducers and repressors, leading to their collective designation as a xylan-utilization regulon. Physiological studies defining the preferential utilization of MeGXn compared to MeGX and MeGX3 support a process in which extracellular depolymerization, assimilation and intracellular metabolism are coupled, allowing the rapid and complete utilization of MeGXn.
Pjdr2 was the first member of this genus to have its genome completely sequenced and made available for detailed analysis. The genome sequences of two strains of Paenibacillus polymyxa [5,6], “Paenibacillus vortex”, and Paenibacillus sp. Y412MC10 (NCBI NC_013406.1, unpublished results) have since been completed. The incomplete genome sequence of Paenibacillus larvae subsp. larvae, the causative agent of American Foulbrood disease of honey bees, has also been analyzed.
Classification and features
The unrooted phylogenetic tree shows Pjdr2 in a branch that includes the other Paenibacillus spp. in this comparison, supporting a lineage distinct from other Gram-positive endospore-forming bacteria. Pjdr2 groups more closely with Paenibacillus lentimorbus and other Paenibacillus species that are insect pathogens than with the group that includes the type species, Paenibacillus polymyxa. Given its genome size and the metabolic potential imputed from its sequence, it is surprising that Pjdr2 is not more closely related to Paenibacillus sp. Y412MC10 on the basis of 16S rRNA sequence. Despite the close similarity of Paenibacillus sp. JDR-2 to Microbacterium species with respect to membrane fatty acids (see discussion below), the 16S rRNA sequence makes clear that it is not related to members of the genus Microbacterium.
Classification and general features of Paenibacillus sp. JDR-2 according to the MIGS recommendations.
Species Paenibacillus sp. Strain JDR-2
Glucose, xylose, β-1,4-xylan, β-1,4-1,3-glucan, 4-O-methyl-glucuronoxylose
Sweet Gum stem wood
Sweet Gum stem wood in soil
Sample collection time
180 feet above msl
Fatty acid methyl ester (FAME) analysis of Pjdr2 provided an alternative approach for determining relatedness to other bacteria. Cultures were grown to exponential phase (24 h) on Trypticase soy agar. Bacterial cells were harvested and extracted according to the standard MIDI protocol. FAME analysis was conducted using the Sherlock Microbial Identification System 4.5. Analyses showed that the predominant fatty acid in Pjdr2 is anteiso-C15:0 (46.93%), which together with iso-C16:0 (23.02%) and C16:0 (13.48%) constituted >80% of the fatty acid composition of this strain. Minor fatty acids included iso-C14:0 (3.92%), C14:0 (2.35%), and iso-C15:0 (5.29%).
A similarity index (SI) value of 0.5 or higher indicates a good library comparison (MIDI 2002). The two strains that most closely match the profile of Pjdr2 are Microbacterium laevaniformans (SI = 0.75) and Cellulobacterium cellulans (SI = 0.51). We have included these two species in our phylogenetic analysis based upon their 16S rRNA sequences (Figure 1). The FAME analysis provided a rapid assignment of the species by comparing the fatty acid profile with those of 60 strains (42 species) of Bacillus, 2 strains (1 species) of Cellulobacterium, 20 strains (19 species) of Microbacterium and 20 strains (18 species) of Paenibacillus, as well as other aerobic bacteria. Sequence analysis of 16S rRNA provides the accepted basis for considering phylogenetic relationships. Nevertheless, the FAME analysis provides a convenient method with which to confirm the identity of the organism as it is maintained and studied over time.
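The composition figures in this paragraph can be checked with simple arithmetic. The following is a minimal Python sketch that uses only the percentages reported above; the dictionary and variable names are illustrative and are not part of the MIDI Sherlock software.

```python
# Fatty acid percentages for Pjdr2 as reported in the FAME analysis above.
major = {"anteiso-C15:0": 46.93, "iso-C16:0": 23.02, "C16:0": 13.48}
minor = {"iso-C14:0": 3.92, "C14:0": 2.35, "iso-C15:0": 5.29}

# Sum each group and identify the single most abundant fatty acid.
major_total = sum(major.values())
minor_total = sum(minor.values())
accounted = major_total + minor_total
predominant = max(major, key=major.get)

print(f"Predominant fatty acid: {predominant}")
print(f"Three major fatty acids: {major_total:.2f}%")
print(f"Listed minor fatty acids: {minor_total:.2f}%")
print(f"Accounted for by the six listed fatty acids: {accounted:.2f}%")
```

The three major components sum to 83.43%, consistent with the >80% stated in the text, and the six listed fatty acids together account for 94.99% of the profile.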
Growth conditions and DNA isolation
For the preparation of genomic DNA, one of several colonies surrounded by a clear zone was picked from an agar plate (0.1% oat spelt xylan/0.1% yeast extract/Zucker-Hankin medium) and grown in Zucker-Hankin/1% yeast extract at 30°C with shaking at 240 rpm. A culture (8 ml) at an OD600nm of 0.6 was inoculated into 48 ml of culture medium (Zucker-Hankin, 1% yeast extract). This culture was grown to an OD600nm of 0.6 and cells were collected by centrifugation. High molecular weight DNA was prepared from these cells following the protocol provided by JGI. Cells were suspended in TE buffer (10 mM Tris-HCl, 1.0 mM EDTA, pH 8.0) and treated with lysozyme to degrade the cell wall. SDS and Proteinase K were added to denature and degrade proteins. NaCl and CTAB were added to facilitate subsequent precipitation. Cell lysates were extracted with phenol and chloroform, and the DNA was precipitated by addition of isopropanol. The nucleic acid pellet was washed with 70% ethanol, dissolved in water and then treated with RNase A.
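The scale-up step above implies a dilution that can be worked out directly. The sketch below is illustrative only: it assumes that OD600nm is proportional to cell density over this range and that the fresh medium contributes no absorbance, neither of which is stated in the protocol. The variable names are hypothetical.

```python
import math

# Values taken from the protocol described above.
inoculum_ml = 8.0        # starter culture volume
inoculum_od = 0.6        # OD600nm of the starter culture
fresh_medium_ml = 48.0   # Zucker-Hankin, 1% yeast extract
target_od = 0.6          # OD600nm at harvest

# Assumption: OD600nm scales linearly with cell density, so dilution
# reduces the reading in proportion to the volume increase.
total_ml = inoculum_ml + fresh_medium_ml
start_od = inoculum_od * inoculum_ml / total_ml

# Number of population doublings needed to return to the target OD.
doublings = math.log2(target_od / start_od)

print(f"Total culture volume: {total_ml:.0f} ml")
print(f"Estimated starting OD600nm: {start_od:.3f}")
print(f"Doublings needed to reach OD600nm {target_od}: {doublings:.2f}")
```

Under these assumptions the 8 ml inoculum is diluted sevenfold to 56 ml, so the culture starts near an OD600nm of 0.086 and needs a little under three doublings before harvest.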
Genome sequencing and assembly
The genome of Pjdr2 was sequenced at the JGI using a combination of 8 kb and 40 kb (fosmid) DNA libraries. In addition to Sanger sequencing, 454 pyrosequencing was performed to a depth of 20× coverage. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website. Draft assemblies were based on 39,689 total reads. All three libraries provided 5.1× coverage of the genome. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment [31–33]. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with Dupfinisher or transposon bombing of bridging clones (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, custom primer walks, or PCR amplification (Roche Applied Science, Indianapolis, IN). A total of 1,028 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. The completed sequence of Pjdr2 contained 45,057 reads, achieving an average of 5.5-fold sequence coverage per base, with an error rate of less than 1 in 100,000. The complete nucleotide sequence of Paenibacillus sp. strain JDR-2 and its annotation can be found online at the IMG (Integrated Microbial Genomes) portal of JGI, as well as at the genome resource site of NCBI.
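The read counts, coverage values and error rate quoted above are related by two standard formulas: average coverage C = N × L / G (N reads of mean length L over a genome of G bases) and the Phred-scaled quality Q = −10 log10(P). The following Python sketch rearranges the first to estimate the mean read length and applies the second to the stated error rate. It is a rough check, not part of the JGI pipeline: the genome size is taken from the abstract, the function names are hypothetical, and the sketch assumes that each reported read count and coverage value describe the same set of reads, which the text does not state explicitly.

```python
import math

GENOME_BP = 7_184_930  # genome size reported in the abstract


def implied_mean_read_length(coverage: float, n_reads: int,
                             genome_bp: int = GENOME_BP) -> float:
    """Mean read length L implied by average coverage C = N * L / G."""
    return coverage * genome_bp / n_reads


def phred_quality(error_probability: float) -> float:
    """Phred-scaled quality Q = -10 * log10(P)."""
    return -10.0 * math.log10(error_probability)


# Draft assembly: 39,689 reads at 5.1x; finished sequence: 45,057 reads at 5.5x.
draft_len = implied_mean_read_length(5.1, 39_689)
finished_len = implied_mean_read_length(5.5, 45_057)

# An error rate of 1 in 100,000 expressed on the Phred scale.
finished_q = phred_quality(1 / 100_000)

print(f"Implied mean read length (draft): {draft_len:.0f} bp")
print(f"Implied mean read length (finished): {finished_len:.0f} bp")
print(f"Phred-scaled quality for 1 error in 100,000: Q{finished_q:.0f}")
```

Under these assumptions the implied mean read lengths are roughly 920 bp for the draft and 880 bp for the finished sequence, and an error rate of 1 in 100,000 corresponds to a Phred-scaled quality of 50. The read lengths are derived estimates rather than values reported by the authors.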
% of Total
Genome size (bp)
DNA coding region (bp)
DNA G+C content (bp)
Number of replicons
Protein coding genes
Genes with function prediction
Protein coding genes with COGs
Protein coding genes with Pfam
Genes in paralog clusters
Protein coding genes coding signal peptides
Genes connected to transporter classification
Insights from genome sequencing
Utilization of lignocellulosics
The nucleotide sequence of a cluster of genes that included the α-glucuronidase gene served as a marker for the sequenced genome. The sequence of this cluster was previously determined in a cosmid clone of the genomic DNA of Pjdr2. The presence of this unique contiguous sequence in a single copy without orthologs or paralogs supported the final genomic sequence as representative of a single genome from a pure culture. This aldouronate-utilization gene cluster, in conjunction with the distal gene encoding a multimodular cell-associated GH10 endoxylanase, constitutes a xylan-utilization regulon as previously defined. The coordinate expression of the genes in this regulon supports a process in which assimilation of the aldouronate 4-O-methylglucuronoxylotriose, generated by a cell-associated GH10 endoxylanase, is coupled to extracellular depolymerization, facilitating depolymerization, assimilation and metabolism as previously described. The sequencing of the genome of Paenibacillus sp. strain JDR-2 has allowed further analysis of its xylan-utilization regulon and the identification of similar regulons involved in the depolymerization and utilization of soluble β-glucans.
Number of genes associated with the general COG functional categories
Translation, ribosomal structure and biogenesis
RNA processing and modification
Replication, recombination and repair
Chromatin structure and dynamics
Cell cycle control, cell division, chromosome partitioning
Signal transduction mechanisms
Cell wall/membrane/envelope biogenesis
Intracellular trafficking, secretion, and vesicular transport
Posttranslational modification, protein turnover, chaperones
Energy production and conversion
Carbohydrate transport and metabolism
Amino acid transport and metabolism
Nucleotide transport and metabolism
Coenzyme transport and metabolism
Lipid transport and metabolism
Inorganic ion transport and metabolism
Secondary metabolites biosynthesis, transport and catabolism
General function prediction only
Not in COGs
We thank the Electron Microscopy and Bio-Imaging laboratory, Interdisciplinary Center for Biotechnology Research, University of Florida, for their assistance in preparing the scanning electron micrographs of strain Pjdr2. We also thank Len Pennacchio, Natalia Ivanova, Roxanne Tapia and Shunsheng Han for their contributions to the genome sequencing and annotation of this organism. The genomic sequencing was conducted by the U.S. Department of Energy Joint Genome Institute and supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. This work was supported by funds from the Department of Energy via the Consortium for Plant Biotechnology Research and the Joint Genome Institute (Project ID 4043135).
- Preston JF, Hurlbert JC, Rice JD, Ragunathan A, St. John FJ. Microbial Strategies for the Depolymerization of Glucuronoxylan: Leads to the Biotechnological Applications of Endoxylanases in “Application of Enzymes to Lignocellulosics”, eds S.D. Mansfield and J. N. Saddler. ACS Symposium Series No. 855. Ch 12. pp191–210. 2003.
- St. John FJ, Rice J, Preston J. Paenibacillus sp. strain JDR-2 and XynA1: a novel system for methylglucuronoxylan utilization. Appl Environ Microbiol 2006; 72:1496–1506. http://dx.doi.org/10.1128/AEM.72.2.1496-1506.2006
- Chow V, Nong G, Preston J. Structure, function, and regulation of the aldouronate utilization gene cluster from Paenibacillus sp. strain JDR-2. J Bacteriol 2007; 189:8863–8870. http://dx.doi.org/10.1128/JB.01141-07
- Nong G, Rice J, Chow V, Preston J. Aldouronate utilization in Paenibacillus sp. strain JDR-2: Physiological and enzymatic evidence for coupling of extracellular depolymerization and intracellular metabolism. Appl Environ Microbiol 2009; 75:4410–4418. http://dx.doi.org/10.1128/AEM.02354-08
- Ma M, Wang C, Ding Y, Li L, Shen D, Jiang X, Guan D, Cao F, Chen H, Feng R, et al. Complete genome sequence of Paenibacillus polymyxa SC2, a strain of plant growth-promoting Rhizobacterium with broad-spectrum antimicrobial activity. J Bacteriol 2011; 193:311–312. http://dx.doi.org/10.1128/JB.01234-10
- Kim JF, Jeong H, Park SY, Kim SB, Park YK, Choi SK, Ryu CM, Hur CG, Ghim SY, Oh TK, et al. Genome sequence of the polymyxin-producing plant-probiotic rhizobacterium Paenibacillus polymyxa E681. J Bacteriol 2010; 192:6103–6104. http://dx.doi.org/10.1128/JB.00983-10
- Sirota-Madi A, Olender T, Helman Y, Ingham C, Brainis I, Roth D, Hagi E, Brodsky L, Leshkowitz D, Galatenko V, et al. Genome sequence of the pattern forming Paenibacillus vortex bacterium reveals potential for thriving in complex environments. BMC Genomics 2010; 11:710. http://dx.doi.org/10.1186/1471-2164-11-710
- Chan QW, Melathopoulos AP, Pernal SF, Foster LJ. The innate immune and systemic response in honey bees to a bacterial pathogen, Paenibacillus larvae. BMC Genomics 2009; 10:387. http://dx.doi.org/10.1186/1471-2164-10-387
- Tamura K, Dudley J, Nei M, Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 2007; 24:1596–1599. http://dx.doi.org/10.1093/molbev/msm092
- Li J, Beatty PK, Shah S, Jensen SE. Use of PCR-targeted mutagenesis to disrupt production of fusaricidin-type antifungal antibiotics in Paenibacillus polymyxa. Appl Environ Microbiol 2007; 73:3480–3489. http://dx.doi.org/10.1128/AEM.02662-06
- Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008; 26:541–547. http://dx.doi.org/10.1038/nbt1360
- Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576–4579. http://dx.doi.org/10.1073/pnas.87.12.4576
- Murray RGE. The Higher Taxa, or, a Place for Everything…? In: Holt JG (ed), Bergey’s Manual of Systematic Bacteriology, First Edition, Volume 1, The Williams and Wilkins Co., Baltimore, 1984, p. 31–34.
- Garrity GM, Holt JG. The Road Map to the Manual. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 119–169.
- Ludwig W, Schleifer KH, Whitman WB. Class I. Bacilli class nov. In: De Vos P, Garrity G, Jones D, Krieg NR, Ludwig W, Rainey FA, Schleifer KH, Whitman WB (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 3, Springer-Verlag, New York, 2009, p. 19–20.
- Euzéby J. List of new names and new combinations previously effectively, but not validly, published. List no. 132. Int J Syst Evol Microbiol 2010; 60:469–472. http://dx.doi.org/10.1099/ijs.0.022855-0
- Prévot AR. In: Hauderoy P, Ehringer G, Guillot G, Magrou. J., Prévot AR, Rosset D, Urbain A (eds), Dictionnaire des Bactéries Pathogènes, Second Edition, Masson et Cie, Paris, 1953, p. 1–692.
- Skerman VBD, McGowan V, Sneath PHA. Approved Lists of Bacterial Names. Int J Syst Bacteriol 1980; 30:225–420. http://dx.doi.org/10.1099/00207713-30-1-225
- De Vos P, Ludwig W, Schleifer KH, Whitman WB. Family IV. Paenibacillaceae fam. nov. In: De Vos P, Garrity G, Jones D, Krieg NR, Ludwig W, Rainey FA, Schleifer KH, Whitman B (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 3, Springer-Verlag, New York, 2009, p. 269.
- Ash C, Priest FG, Collins MD. Molecular identification of rRNA group 3 bacilli (Ash, Farrow, Wallbanks and Collins) using a PCR probe test. Proposal for the creation of a new genus Paenibacillus. Antonie van Leeuwenhoek 1993; 64:253–260. http://dx.doi.org/10.1007/BF00873085
- Murray RGE, ed. Validation List no. 51. Validation of the publication of new names and new combinations previously effectively published outside the IJSB. Int J Syst Bacteriol 1994; 44:852. http://dx.doi.org/10.1099/00207713-44-4-852
- Euzéby JP. Taxonomic note: necessary correction of specific and subspecific epithets according to Rules 12c and 13b of the International Code of Nomenclature of Bacteria (1990 Revision). Int J Syst Bacteriol 1998; 48:1073–1075. http://dx.doi.org/10.1099/00207713-48-3-1073
- Tindall BJ. What is the type species of the genus Paenibacillus? Request for an Opinion. Int J Syst Evol Microbiol 2000; 50:939–940. http://dx.doi.org/10.1099/00207713-50-2-939
- Trüper HG. The type species of the genus Paenibacillus Ash et al. 1994 is Paenibacillus polymyxa. Opinion 77. Judicial Commission of the International Committee on Systematics of Prokaryotes. Int J Syst Evol Microbiol 2005; 55:513.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene Ontology: tool for the unification of biology. Nat Genet 2000; 25:25–29. http://dx.doi.org/10.1038/75556
- Sasser M. Microbial Identification by gas chromatographic analysis of fatty acid methyl esters (GC_FAME). MIDI Technical Note 101. MIDI Inc. Newark, DE; 2009.
- MIDI. MIS Operating Manual. MIDI, Inc., Newark, DE 19713; 2002.
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005; 437:376–380.
- DOE Joint Genome Institute. http://www.jgi.doe.gov
- The Phred/Phrap/Consed software package. http://www.phrap.com
- Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8:186–194.
- Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 1998; 8:175–185.
- Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8:195–202.
- Han C, Chain P. Finishing repeat regions automatically with Dupfinisher. In: Arabnia H, Valafar, H, editor. Proceedings of the 2006 International Conference on Bioinformatics & Computational Biology. CSREA Press; 2006. p 141–146.
- Integrated Microbial Genome portal of JGI. http://img.jgi.doe.gov/cgibin/w/main.cgi?section=TaxonDetail&taxon oid=644736396
- NCBI. http://www.ncbi.nlm.nih.gov/sites/entrez?db=gen ome&cmd=Retrieve&dopt=Overview&listuids=1609.
- Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11:119. http://dx.doi.org/10.1186/1471-2105-11-119
- Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007; 35:3100–3108. http://dx.doi.org/10.1093/nar/gkm160
- Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997; 25:955–964. http://dx.doi.org/10.1093/nar/25.5.955
- Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res 2003; 31:439–441. http://dx.doi.org/10.1093/nar/gkg006
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 2001; 305:567–580. http://dx.doi.org/10.1006/jmbi.2000.4315
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004; 340:783–795. http://dx.doi.org/10.1016/j.jmb.2004.05.028