Complete genome sequence of the lignin-degrading bacterium Klebsiella sp. strain BRL6-2

In an effort to discover anaerobic bacteria capable of lignin degradation, we isolated Klebsiella sp. strain BRL6-2 on minimal media with alkali lignin as the sole carbon source. This organism was isolated anaerobically from tropical forest soils collected from the Bisley watershed at the Ridge site in the El Yunque National Forest in Puerto Rico, USA, part of the Luquillo Long-Term Ecological Research Station. At this site, the soils experience strong fluctuations in redox potential and are characterized by cycles of iron oxidation and reduction. The strain was targeted for genome sequencing because of its ability to grow on lignin anaerobically and its lignocellulolytic activity in in vitro enzyme assays. The genome of Klebsiella sp. strain BRL6-2 is 5.80 Mbp with no detected plasmids, and includes a relatively small arsenal of genes encoding lignocellulolytic carbohydrate active enzymes. The genome revealed four putative peroxidases including glutathione and DyP-type peroxidases, and a complete protocatechuate pathway encoded in a single gene cluster. Physiological studies revealed Klebsiella sp. strain BRL6-2 to be relatively tolerant of high ionic strength conditions. It grows in increasing concentrations of ionic liquid (1-ethyl-3-methyl-imidazolium acetate) up to 73.44 mM and NaCl up to 1.5 M.


Introduction
Lignin is one of the biggest barriers to efficient lignocellulose deconstruction because it occludes the action of cellulases. It is also a major waste stream after lignocellulose deconstruction. Tropical forest soils are the sites of very high rates of decomposition, accompanied by very low and fluctuating redox potential conditions [1,2]. Because early stage decomposition is typically dominated by fungi and the free-radical generating oxidative enzymes phenol oxidase and peroxidase [3,4], we targeted anaerobic tropical forest soils with the idea that they would be dominated by bacterial rather than fungal decomposers. Bacteria grow faster than fungi, allowing higher recombinant enzyme production for commercial use [5]. To discover organisms that were capable of breaking down lignin without the use of oxygen free radicals, we isolated Klebsiella sp. strain BRL6-2 under anaerobic conditions using lignin as the sole carbon source. In addition, this strain was observed to withstand moderately high concentrations of ionic liquids, and thus was targeted for whole genome sequencing.

Organism information
Klebsiella sp. strain BRL6-2 was isolated from soil collected from the Bisley watershed at the Ridge site in the El Yunque experimental forest, part of the Luquillo Long-Term Ecological Research Station in Luquillo, Puerto Rico, USA. A soil slurry was made with 1 gram of soil sample diluted in 100 ml of MOD CCMA media without carbon source, serially diluted and inoculated to roll tubes containing MOD CCMA media with alkali lignin as the C source. MOD CCMA media consists of 2.8 g L⁻¹ NaCl, 0.1 g L⁻¹ KCl, 27 mM MgCl₂, 1 mM CaCl₂, 1.25 mM NH₄Cl, 9.76 g L⁻¹ MES, 1.1 ml L⁻¹ filter sterilized 1 M K₂HPO₄, 12.5 ml L⁻¹ trace minerals [6,7], and 1 ml L⁻¹ Thauer's vitamins [8]. Tubes were incubated at room temperature for up to 12 weeks, at which point the colony was picked from a roll tube that had been inoculated with a 10⁻⁴ dilution of soil slurry, grown in 10% tryptic soy broth (TSB), and characterized.
For initial genotyping and for validating the isolation, the small subunit ribosomal RNA gene was sequenced by Sanger sequencing using the universal primers 8F and 1492R [9]. The 16S rRNA gene sequence places Klebsiella sp. strain BRL6-2 in the domain Bacteria, phylum Proteobacteria, class Gammaproteobacteria, and order Enterobacterales (Figure 1A). However, the small subunit ribosomal RNA (16S rRNA) sequence is not sufficient to clearly define the evolutionary history of this region of the Gammaproteobacteria, so we also constructed a hierarchical clustering of whole genomes based on pfams [10] (Figure 1B). This clustering supports the placement of Klebsiella sp. strain BRL6-2 within the order Enterobacterales.
Figure 1 Phylogenetic trees highlighting the position of Klebsiella sp. strain BRL6-2 relative to other type and non-type strains within the Gammaproteobacteria, based on (A) 16S ribosomal RNA phylogeny, and (B) whole genome classification based on pfams. Strains are shown with corresponding NCBI genome project ids listed within [11]. The 16S tree uses sequences aligned by the RDP aligner, the Jukes-Cantor corrected distance model to construct a distance matrix based on alignment model positions without the use of alignment inserts, and a minimum comparable position of 200. The tree is built with RDP Tree Builder, which uses Weighbor [12] with an alphabet size of 4 and length size of 1000. The building of the tree also involves a bootstrapping process repeated 100 times to generate a majority consensus tree [13]. The whole genome classification is a hierarchical clustering of pfams groups that was generated using the Integrated Microbial Genomes (IMG) system [14]. Succinimonas amylolytica DSM2873, Succinatimonas hippei YIT12066, and Tolumonas auensis TA 4 DSM9187 are type strains with genomes available in IMG. All others are non-type strains.
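The Jukes-Cantor correction named in the Figure 1 legend can be illustrated with a minimal sketch. This is not the RDP implementation: the function name, the handling of gaps and ambiguous bases, and the behaviour when too few positions are comparable are all assumptions made for illustration. Only the distance formula itself, d = -(3/4) ln(1 - 4p/3) for an observed mismatch fraction p, is the standard model.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str, min_comparable: int = 200) -> float:
    """Jukes-Cantor corrected distance between two aligned nucleotide sequences.

    min_comparable mirrors the "minimum comparable position of 200" in the
    figure legend; raising an error below that threshold is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only alignment columns where both sequences carry an unambiguous base
    # (gaps and ambiguity codes are skipped; this rule is an assumption).
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if len(pairs) < min_comparable:
        raise ValueError("too few comparable positions")

    # Observed proportion of mismatched positions.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation

    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


# Toy example with a lowered threshold: one mismatch in ten comparable positions.
toy_distance = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAA", min_comparable=1)
print(f"{toy_distance:.4f}")
```

The corrected distance is always at least as large as the raw mismatch fraction, because the model accounts for multiple substitutions at the same site.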

Genome sequencing information
Genome project history
The genome was selected based on the ability of Klebsiella sp. strain BRL6-2 to grow on and degrade lignin anaerobically (Table 1). The genome sequence was completed on February 1, 2013, and presented for public access on April 17, 2014 by GenBank. Finishing was completed at Los Alamos National Laboratory. A summary of the project information and its association with MIGS version 2.0 compliance [25] is shown in Table 2.

Growth conditions and DNA preparation
Klebsiella sp. strain BRL6-2 grows well aerobically and anaerobically, and was routinely cultivated aerobically in 10% tryptic soy broth (TSB) with shaking at 200 rpm at 30°C. DNA for sequencing was obtained using the Qiagen Genomic-tip kit, following the manufacturer's instructions for the 500/G size extraction. Three column preparations were necessary to obtain 50 μg of high molecular weight DNA. The quantity and quality of the extraction were checked by gel electrophoresis using JGI standards.

Genome sequencing and assembly
The draft genome of Klebsiella sp. strain BRL6-2 was generated at the DOE Joint Genome Institute (JGI) using a hybrid of the Illumina and Pacific Biosciences (PacBio) technologies. An Illumina standard shotgun library and a long insert mate pair library were constructed and sequenced using the Illumina HiSeq 2000 platform [26]. All general aspects of library construction and sequencing

Genome annotation
Genes were identified using Prodigal [29] as part of the DOE-JGI annotation pipeline [30], followed by a round of manual curation using the JGI GenePRIMP pipeline [31]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG-ER) platform (http://img.jgi.doe.gov/er) [32].

Genome properties
The genome consists of one 5,801,355 bp circular chromosome with no discernible plasmids and a GC content of 55.24% (Table 3). Of the 5,495 genes predicted, 5,296 were protein-coding genes and 199 were RNA genes; 54 pseudogenes were also identified. The majority of the protein-coding genes (86.3%) were assigned a putative function, while the remaining ones were annotated as hypothetical proteins. The distribution of genes into COG functional categories is presented in Table 4.
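The gene counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this section; the function and variable names are illustrative and not part of any annotation pipeline.

```python
# Values quoted in the Genome properties section.
GENOME_BP = 5_801_355
TOTAL_GENES = 5_495
PROTEIN_CODING = 5_296
RNA_GENES = 199


def summarize_genome(genome_bp: int, total_genes: int,
                     protein_coding: int, rna_genes: int) -> dict:
    """Return simple derived statistics after checking that the counts add up."""
    if protein_coding + rna_genes != total_genes:
        raise ValueError("protein-coding and RNA genes do not sum to the total")
    return {
        "genome_mbp": genome_bp / 1e6,
        "protein_coding_fraction": protein_coding / total_genes,
        "genes_per_kb": total_genes / (genome_bp / 1_000),
    }


summary = summarize_genome(GENOME_BP, TOTAL_GENES, PROTEIN_CODING, RNA_GENES)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The protein-coding and RNA gene counts sum exactly to the reported total, and the chromosome length rounds to the 5.80 Mbp given in the abstract.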

Metabolic characterization using Biolog phenotypic microarray
The Biolog phenotypic microarray was used to test Klebsiella sp. strain BRL6-2's utilization of a variety of carbon, nitrogen, phosphorus, and sulfur sources. Different modifications of the isolation medium, MOD CCMA [33], were used to resuspend cells when inoculating different PM plates (Table 5). The scheme is similar to that used with D. vulgaris by Borglin et al. [34]. Plates were run iteratively to optimize each component before proceeding to the next. For all runs, a cell suspension at 0.1 OD600 and Biolog Redox Dye Mix G were used to inoculate the plates. All plates were prepared in duplicate, incubated at 30°C, and read every 15 minutes for 4.5 days. PM1 and PM2 (carbon sources) were prepared anaerobically and aerobically to compare respiration. The anaerobic plates were prepared in an anaerobic chamber in degassed medium and sealed in gas-tight Whirlpak bags before loading into the Omnilog reader.

Table 3 (residual row and footnotes): CRISPR repeats NA. a) The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome. b) Also includes 54 pseudogenes and 5 other genes.

Carbon sources
A total of 190 different carbon substrates were tested using phenotypic microarray plates. The chemical additives that produced the highest increase in respiration relative to background, measured by the change in redox dye color, are listed in Table 6. D-mannose was used in subsequent plates because of its convenient powder form compared to the viscous Tween solutions, which are mixtures of polyoxyethylene sorbitan esters of saturated fatty acids (predominantly 12:0, 14:0, and 16:0) and are typically used as surfactants. Although the strain was isolated on lignin, D-cellobiose was utilized at almost the same rate as the simpler carbohydrates glucose and xylose, which could suggest high cellulolytic activity as well.

Anaerobic vs. aerobic carbon source utilization
There were no significant differences between the aerobic and anaerobic utilization of the PM carbon sources. There is a vertical shift in the respiration curves, which is due to a difference in the starting OL at t = 0, as seen in negative control well A01.

Nitrogen, phosphorus, and sulfur sources
A total of 380 nitrogen sources were tested using phenotypic microarray plates. The most utilized nitrogen sources are reported in Table 7. Dipeptides were some of the most utilized sources, but ammonia from the original MOD CCMA was used in subsequent plates to avoid adding any other potential carbon source. Based on similar reasoning, phosphate was used for subsequent plates (Table 8). Within the sulfur wells, there was robust respiration in the negative control background well, indicating that the buffer MES in the MOD CCMA media can serve as a possible sulfur source (Table 9). Since none of the other sulfur sources produced respiration significantly higher than background, MES served as the sulfur source in the following plates.

Osmolyte stress response
Klebsiella sp. strain BRL6-2 was tested for respiration in a variety of osmolyte stressors and a range of pH (Table 10), with and without osmoprotectants (Table 11). For these assays, 20 mM D-mannose MOD CCMA was used to inoculate the osmolyte response assays in Omnilog PM plates 9 and 10. Klebsiella sp. strain BRL6-2 is relatively halotolerant, as it grew in increasing concentrations of NaCl up to 9%, which is approximately 1.5 M. The addition of trehalose, glycerol, octopine, and trimethylamine-N-oxide aided respiration in the presence of 6% NaCl. Of all the osmolytes tested, the strain was found to be particularly sensitive to sodium benzoate. Klebsiella sp. strain BRL6-2 was found to respire at faster rates in pH 8-10, with the optimum at pH 8.
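The correspondence between 9% NaCl and roughly 1.5 M can be checked with a short conversion. The sketch assumes the percentages are weight per volume (g per 100 ml), which the text does not state explicitly; the helper name is illustrative.

```python
NACL_MOLAR_MASS = 58.44  # g/mol


def percent_wv_to_molar(percent_wv: float, molar_mass: float) -> float:
    """Convert a % (w/v) concentration to mol/L, assuming 1% = 10 g/L."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / molar_mass


# Highest NaCl level tolerated, and the level used with osmoprotectants.
for percent in (9.0, 6.0):
    molar = percent_wv_to_molar(percent, NACL_MOLAR_MASS)
    print(f"{percent:.0f}% NaCl is about {molar:.2f} M")
```

Under this assumption, 9% NaCl corresponds to about 1.54 M, consistent with the rounded 1.5 M reported here and in the abstract.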

Lignocellulose degradation
Because Klebsiella sp. strain BRL6-2 was initially isolated based on colony formation on minimal media with lignin supplied as the sole carbon source [35], we examined the genome for genes encoding putative proteins that would be associated with lignin degradation. It has a full protocatechuate pathway for the degradation of catechol to β-ketoadipate, as in Cupriavidus basilensis OR16 and Sphingomonas paucimobilis SYK6 [36,37]. It has six putative peroxidase genes, encoding glutathione peroxidases, DyP-type peroxidases, and catalases/peroxidases; all are potentially important for lignin degradation [38,39]. It has two putative lactate dehydrogenase genes (EC:1.1.1.28) and two putative catalase genes (EC:1.11.1.6), and no laccase genes. It also has multiple cytochrome oxidase genes, suggesting the possible use of lignin as a terminal electron acceptor, as was previously observed for a related isolate, Enterobacter lignolyticus SCF1 [40]. For the degradation of other relevant lignocellulose components like xylan and cellulose, Klebsiella sp. strain BRL6-2 has 2 xylanase genes, 6 β-xylosidase genes, 12 β-glucosidase genes, and 2 endoglucanase genes. After isolation of the strain on lignin, the ability of Klebsiella sp. strain BRL6-2 to degrade several lignocellulose analogs in vitro was measured. Using a 4-methylumbelliferone based enzyme assay that has been previously used on bacterial isolates [35], cells grown in MOD CCMA plus 20 mM mannose had high levels of β-glucosidase and xylosidase activity, with 80% and 28% of the given substrate degraded within 45 hours, respectively. However, the strain had low cellobiohydrolase activity. Klebsiella sp. strain BRL6-2 was also tested for CMCase, another important class of cellulase, using a reducing sugar detection assay with 3,5-dinitrosalicylic acid (DNS) reagent and CMC [41]. No activity was detected on CMC.
These low cellulase activities could not be improved by growing cells in MOD CCMA plus 20 mM mannose supplemented with 0.1% CMC. Although cellobiose was a well-utilized substrate in the phenotypic microarray measurements, this may be due to the effective β-glucosidase of Klebsiella sp. strain BRL6-2.

Ionic liquid tolerance
Currently, ionic liquids are being investigated for their application to bioenergy feedstock pretreatment; one of these is 1-ethyl-3-methyl-imidazolium acetate (Emim-Acetate). Klebsiella sp. strain BRL6-2 was tested for growth in 20 mM mannose MOD CCMA in the presence of 0 mM, 36.72 mM, 73.44 mM, 146.88 mM, 293.75 mM, and 587.51 mM Emim-Acetate. A 6% inoculum from a 0.4 OD600 cell suspension was used to inoculate each treatment. Biolog Dye Mix G was used to monitor cell respiration during the incubation at 30°C within a Biolog reader. Klebsiella sp. strain BRL6-2 could tolerate up to 73.44 mM Emim-Acetate, with increased lag phase and decreased final yields at increasing concentrations of Emim-Acetate. This is not as ionic liquid tolerant as Enterobacter lignolyticus SCF1, which was isolated in the same screen and showed tolerance of up to 500 mM 1-ethyl-3-methyl-imidazolium chloride [42]. However, Klebsiella sp. strain BRL6-2 tolerates ionic liquid concentrations higher than most bacterial strains, including E. coli, which was highly sensitive to concentrations as low as 14.69 mM. Klebsiella sp. strain BRL6-2 has 1,107 protein-coding genes classified as connected to transporters, and these transporters are likely the source of resistance to high ionic strength, as was also observed in E. lignolyticus SCF1 [42].
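The tested Emim-Acetate concentrations follow a two-fold dilution pattern, which the sketch below reproduces. It assumes the series was prepared as weight-per-volume percentages starting from 10% (w/v); this is an inference from the numbers and is not stated in the text. The molar mass of 170.21 g/mol is that of 1-ethyl-3-methyl-imidazolium acetate (C8H14N2O2), and the function name is illustrative.

```python
EMIM_ACETATE_MOLAR_MASS = 170.21  # g/mol, C8H14N2O2


def percent_wv_to_millimolar(percent_wv: float, molar_mass: float) -> float:
    """Convert a % (w/v) concentration to mmol/L, assuming 1% = 10 g/L."""
    return percent_wv * 10.0 / molar_mass * 1000.0


# Assumed two-fold series from 0.625% up to 10% (w/v).
series_percent = [10.0 / 2 ** k for k in range(4, -1, -1)]
series_mm = [
    round(percent_wv_to_millimolar(p, EMIM_ACETATE_MOLAR_MASS), 2)
    for p in series_percent
]

for percent, millimolar in zip(series_percent, series_mm):
    print(f"{percent:>6.3f}% (w/v) = {millimolar:.2f} mM")
```

Under this assumption, the highest tolerated level of 73.44 mM corresponds to 1.25% (w/v) Emim-Acetate.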

Conclusion
Klebsiella sp. strain BRL6-2 is a member of the order Enterobacterales in the class Gammaproteobacteria, originally isolated based on its ability to grow on lignin as the sole carbon source under anaerobic conditions. Its ability to degrade lignin likely has origins in its full protocatechuate pathway, six putative peroxidase genes, two putative lactate dehydrogenase genes, and two putative catalase genes. It also has multiple cytochrome oxidase genes, suggesting the possibility of dissimilatory as well as assimilatory lignin degradation pathways. We also observed high tolerance of ionic strength conditions, likely facilitated by its many genes classified as transporters. Future experiments with Klebsiella sp. strain BRL6-2 should assess its growth kinetics on purified lignin compounds aerobically and anaerobically to determine the extent of its lignin-degrading potential. Nevertheless, its fast growth, facultative lifestyle, and tolerance to high ionic strength conditions make it an attractive microbial host to bioengineer for industrial lignocellulose degradation and consolidated bioprocessing of biofuels.