High-quality draft genome sequence of the Thermus amyloliquefaciens type strain YIM 77409T with an incomplete denitrification pathway

Thermus amyloliquefaciens type strain YIM 77409T is a thermophilic, Gram-negative, non-motile and rod-shaped bacterium isolated from Niujie Hot Spring in Eryuan County, Yunnan Province, southwest China. In the present study we describe the features of strain YIM 77409T together with its genome sequence and annotation. The genome is 2,160,855 bp long and consists of 6 scaffolds with 67.4 % average GC content. A total of 2,313 genes were predicted, comprising 2,257 protein-coding and 56 RNA genes. The genome is predicted to encode a complete glycolysis, pentose phosphate pathway, and tricarboxylic acid cycle. Additionally, a large number of transporters and enzymes for heterotrophy highlight the broad heterotrophic lifestyle of this organism. A denitrification gene cluster included genes predicted to encode enzymes for the sequential reduction of nitrate to nitrous oxide, consistent with the incomplete denitrification phenotype of this strain.


Introduction
Thermus species have been isolated from both natural and man-made thermal environments such as hot springs, hot domestic water, deep mines, composting systems, and sewage sludge [1][2][3][4][5]. The genus has attracted considerable attention as a source of thermostable enzymes, which have important biotechnological applications [6], and as a model organism to study the mechanisms involved in bacterial adaptation to extreme environments [7]. Members of the genus Thermus were formerly considered to be strictly aerobic, based on the characteristics of the type species Thermus aquaticus [2]. However, many studies have shown that Thermus strains also can grow as facultative anaerobes using nitrogen oxides, sulfur, or metals as terminal electron acceptors under oxygen-deprived conditions [8][9][10]. Cava et al. [11] demonstrated that different T. thermophilus strains can grow anaerobically by reducing nitrate to nitrite or by reducing nitrite to a gaseous nitrogen product.
The nitrogen biogeochemical cycle has been investigated in a few geothermal systems [12], including Great Boiling Spring (GBS), a ~80°C hot spring in the U.S. Great Basin [13][14][15]. Studies in GBS revealed a high flux of nitrous oxide, particularly in the ~80°C source pool, suggesting the importance of incomplete denitrifiers in high-temperature environments. A subsequent cultivation and physiological study of heterotrophic denitrifiers suggested a significant role of T. oshimai and T. thermophilus in denitrification in this hot spring [16]. A follow-up study of the whole genomes of one strain from each species, T. oshimai JL-2 and T. thermophilus JL-18, revealed that they have genes encoding the sequential reduction of nitrate to nitrous oxide but lack genes encoding the nitrous oxide reductase, which explains their incomplete denitrification phenotype [17].
Thermus amyloliquefaciens strain YIM 77409 T was isolated in the course of an investigation of the culturable thermophiles that inhabit geothermal springs in Yunnan Province, southwest China [18]. Strain YIM 77409 T was cultured from a sediment sample collected from Niujie Hot Spring using the serial dilution technique on T5 agar. This organism was able to grow anaerobically using nitrate as a terminal electron acceptor, and may therefore affect the nitrogen biogeochemical cycle. Here we present a summary classification and a set of features of Thermus amyloliquefaciens type strain YIM 77409 T, together with the genome sequence description and annotation. This work may help to better understand the physiological characteristics as well as the ecological role of this organism in hot spring ecosystems.

Classification and features
A taxonomic study using a polyphasic approach placed strain YIM 77409 T in the genus Thermus within the family Thermaceae of the phylum Deinococcus-Thermus and resulted in the description of a novel species, Thermus amyloliquefaciens, named for its ability to digest starch [18]. The highest 16S rRNA gene sequence pairwise similarities for strain YIM 77409 T were found with the type strains T. scotoductus SE-1 T (97.6 %), T. antranikianii HN3-7 T (96.6 %), T. caliditerrae YIM 77925 T (96.5 %), and T. tengchongensis YIM 77924 T (96.1 %) using EzTaxon-e [19]. The sequence similarities were less than 96.0 % with all other species. Phylogenetic analyses based on the 16S rRNA gene sequences show that YIM 77409 T together with T. caliditerrae, T. scotoductus, T. antranikianii, and T. tengchongensis constitute a distinct monophyletic group within the genus Thermus (Fig. 1). The DNA-DNA hybridization value between strain YIM 77409 T and T. scotoductus SE-1 T was 30.6 ± 1.6 % [18], which is lower than the threshold value (70 %) for the recognition of microbial species [20]. Similarly, the average nucleotide identity (ANI) score between the two strains based on genome-wide comparisons was 86.6 %, according to the algorithm proposed by Goris et al. [21], which is lower than the ANI threshold range (95-96 %) for species demarcation [22]. These results indicate that strain YIM 77409 T represents a distinct genospecies in the genus Thermus [18].
Fig. 1 Maximum-likelihood phylogenetic tree of the genus Thermus to highlight the position of Thermus amyloliquefaciens strain YIM 77409 T. The tree was reconstructed based on 1374 aligned positions that remained after the application of the Lane mask to the 16S rRNA gene sequences using MEGA 5.0 [54]. Complete deletion of gaps and missing data and Kimura's two-parameter model were applied. Bootstrap analysis was based on 1000 resamplings. Nodes supported in >75 % (black circles) or >50 % (grey circles) of bootstrap pseudoreplicates (1000 resamplings) for both maximum-likelihood and neighbor-joining methods are indicated. Bar, 0.02 changes per nucleotide. The number of genomes available for each species is included in parentheses (see Table 5) and the asterisk indicates that the genome of the type strain is available. The 16S rRNA gene sequences from Marinithermus hydrothermalis T1 T /AB079382 and Rhabdothermus arcticus 2 M70-1 T /HM856631 were used as outgroups
Fig. 2 Scanning electron microscopy image of Thermus amyloliquefaciens strain YIM 77409 T grown in Thermus medium broth at 65°C for 24 h
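The two species-delimitation checks above reduce to simple threshold comparisons. The following minimal Python sketch uses the values reported in the text (DDH 30.6 %, ANI 86.6 %) and the cited cutoffs (70 % for DDH, and 95 % as the lower bound of the 95-96 % ANI range). The function name and constants are illustrative assumptions and are not part of the original analysis.

```python
# Cutoffs as cited in the text; the names are illustrative assumptions.
DDH_THRESHOLD = 70.0  # % DNA-DNA hybridization cutoff [20]
ANI_THRESHOLD = 95.0  # % lower bound of the 95-96 % ANI range [22]


def is_distinct_genospecies(ddh_percent: float, ani_percent: float) -> bool:
    """Return True when both values fall below their species cutoffs."""
    return ddh_percent < DDH_THRESHOLD and ani_percent < ANI_THRESHOLD


# Values reported for strain YIM 77409 T versus T. scotoductus SE-1 T
print(is_distinct_genospecies(ddh_percent=30.6, ani_percent=86.6))
```

Requiring both criteria to agree is a conservative choice made for this sketch; the text reports the two measures separately.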
Strain YIM 77409 T is Gram-negative, facultatively anaerobic, non-motile, and rod shaped (Fig. 2). Cells are 0.4-0.6 μm wide and 1.5-4.5 μm long. Colonies grown on R2A, T5, and Thermus agar plates for 2 days are yellow and circular. The strain degrades starch and is positive for nitrate reduction. The predominant menaquinone is MK-8. Major fatty acids (>10 %) are iso-C15:0 and iso-C17:0. The polar lipids consist of aminophospholipid, one unidentified phospholipid, and two unidentified glycolipids. Minimum Information about the Genome Sequence [23] of type strain YIM 77409 T is provided in Table 1.

Genome sequencing information
Genome project history
T. amyloliquefaciens strain YIM 77409 T was selected for whole genome sequencing based on its phylogenetic position, denitrifying phenotype, and biotechnological potential. Comparison of the genome of this organism to that of other sequenced Thermus species may provide insights into the molecular basis of the denitrification process in this genus. The genome project for strain YIM 77409 T was deposited in the Genomes OnLine Database [24] and the complete sequences were deposited in GenBank. Sequencing, finishing, and annotation were performed by the Department of Energy Joint Genome Institute (Walnut Creek, CA, USA) using state-of-the-art sequencing technology [25]. A summary of the project information associated with MIGS version 2.0 compliance [23] is shown in Table 2.
Table 1 (excerpt) Property: term (evidence code)
Phylum: Deinococcus-Thermus (TAS [46])
Class: Deinococci (TAS [47,48])
Order: Thermales (TAS [48,49])
Family: Thermaceae (TAS [48,50])
Genus: Thermus (TAS [2,51,52])
Species: Thermus amyloliquefaciens (TAS [18])
Type strain: YIM 77409 T (TAS [18])
Gram stain: Negative (TAS [18])
Cell shape: Rod (TAS [18])
Motility: Non-motile (TAS [18])
Sporulation: Nonsporulating (TAS [18])
Temperature range: 50-70°C (TAS [18])
Optimum temperature: 60-65°C (TAS [18])
pH range; optimum: 6.0-8.0; 7.0 (TAS [18])
Carbon source: Glucose, sucrose, glycerol, maltose, raffinose, trehalose, rhamnose, inositol, xylitol, mannitol, sodium malate, mannose and L-arabinose
Growth conditions and genomic DNA preparation
T. amyloliquefaciens type strain YIM 77409 T was grown aerobically in Thermus medium at 65°C for 2 days [18] and DNA was isolated from 0.5-1.0 g of cell pellet using the Joint Genome Institute CTAB bacterial genomic DNA isolation protocol [26].

Genome sequencing and assembly
The draft genome of T. amyloliquefaciens type strain YIM 77409 T was generated at the DOE JGI using Pacific Biosciences sequencing technology [27]. A PacBio SMRTbell™ library was constructed and sequenced on the PacBio RS platform using three SMRT cells, which generated 264,235 filtered subreads totaling 751.5 Mbp with an N50 contig length of 2,065,958 bp. All general aspects of library construction and sequencing can be found at the JGI website. All raw reads were assembled using HGAP version 2.1.1 [28]. The final draft assembly produced 6 contigs in 6 scaffolds, totaling 2.16 Mbp in size. The input read coverage was 384.9×.

Genome annotation
Genes were identified using Prodigal [29] as part of the JGI microbial annotation pipeline [30], followed by a round of manual curation using the JGI GenePRIMP pipeline [31]. The predicted coding sequences were translated and used to search against the Integrated Microbial Genomes non-redundant database, UniProt, TIGRfam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. The rRNA genes were predicted using the hmmsearch tool from the HMMER 3.0 package [32] and a set of in-house curated HMMs derived from an alignment of full-length rRNA genes selected from IMG isolate genomes; tRNA genes were found using tRNAscan-SE 1.3.1 [33]; other non-coding RNAs and regulatory RNA features were found by searching the genome for the corresponding Rfam profiles using the INFERNAL 1.0.2 package [34]. Additional gene prediction analysis and manual functional annotation were performed using the Integrated Microbial Genomes Expert Review platform developed by the JGI [35]. The analysis of the genome presented here and the annotations are for the version available through IMG (2579778517).

Genome properties
The T. amyloliquefaciens YIM 77409 T high-quality draft genome is 2,160,855 bp long with a 67.4 % G + C content. The genome comprises 2,257 protein-coding genes and 56 RNA genes. The coding regions accounted for 94 % of the whole genome and 1,839 genes were assigned to a putative function, with the remaining genes annotated as hypothetical proteins. A total of 1,558 genes (67.4 %) were assigned to COGs. The properties and the statistics of the genome are presented in Table 3. The distribution of genes into COG functional categories is presented in Table 4.
Table 3 footnotes: a The total is based on either the size of the genome in base pairs or the total number of protein coding genes in the annotated genome. b Pseudogenes may also be counted as protein coding or RNA genes, so is not additive under total gene count.
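The gene counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below, in Python, takes the counts directly from the text. The choice of the total gene count (protein-coding plus RNA genes) as the denominator is an inference from the reported percentages, not something the text states explicitly, and the variable and function names are illustrative.

```python
# Counts as reported in the text
protein_coding = 2257
rna_genes = 56
with_function = 1839  # genes assigned a putative function
with_cog = 1558       # genes assigned to COGs

# Assumption: percentages are computed against the total gene count
total_genes = protein_coding + rna_genes


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print(total_genes)                          # 2313
print(percent(with_function, total_genes))  # 79.5
print(percent(with_cog, total_genes))       # 67.4
```

Under this assumption the results match the 2,313 total genes given in the abstract and the 79.5 % and 67.4 % figures quoted in the comparison with other Thermus genomes.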

Insights from the genome sequence
Comparisons with other Thermus spp. genomes
Twenty-two Thermus genomes from 12 different species have been sequenced, including T. amyloliquefaciens type strain YIM 77409 T, and 7 of them have finished genome sequences. The phylogenetic coverage of these genomes is shown in Fig. 1 and their basic properties are summarized in Table 5. The GC content (67.4 %) of strain YIM 77409 T is around the average value, but the gene number of this strain is lower than the average, possibly indicating gene loss through genomic streamlining in this species. In addition, the percentage of protein-coding genes with functional prediction (79.5 %) is higher than the average, whereas the percentage of protein-coding genes with COGs (67.4 %) is similar to the average of the genus Thermus.

Genes involved in denitrification
Denitrification is a respiratory process that reduces nitrate or nitrite stepwise to nitrogen gas (NO3− → NO2− → NO → N2O → N2), and plays a major role in converting bioavailable nitrogen to recalcitrant dinitrogen gas [38]. Denitrification normally occurs under oxygen-limiting conditions, and is catalyzed by four types of nitrogen oxide reductases in sequence: nitrate reductase (Nar or Nap), nitrite reductase (Nir), nitric oxide reductase (Nor), and nitrous oxide reductase (Nos) [39,40]. Previous studies have demonstrated that some Thermus species have incomplete denitrification phenotypes terminating with the production of nitrite or nitrous oxide [16,41]. This incomplete denitrification is partly encoded by a conjugative element (nitrate conjugative element, NCE) that can be transferred among strains [42]. The NCE is composed of two main operons, nar and nrc, and the transcription factors DnrS and DnrT, which are required for their expression under anaerobic conditions when nitrate is present [43,44]. The periplasmic nitrate reductase subunits NapB and NapC were not found in the genome of T. amyloliquefaciens YIM 77409 T, consistent with the use of the Nar system in the Thermales. Figure 3 shows the organization of the nar operon and neighboring genes involved in denitrification in T. amyloliquefaciens YIM 77409 T, T. tengchongensis YIM 77401, and T. scotoductus SA-01. They are located on the chromosome in strains YIM 77409 T and YIM 77401, as in T. scotoductus SA-01. However, these gene clusters are located on megaplasmids in T. thermophilus and T. oshimai strains [17]. The nar operons show a high degree of synteny and consist of narCGHJIKT encoding the associated periplasmic cytochrome NarC, the membrane-bound nitrate reductase (NarGHI), the dedicated chaperone NarJ, the nitrate/proton symporter (NarK1), which might also function in nitrite extrusion in T. thermophilus HB8 T, and the nitrate/nitrite antiporter (NarK2).
Regulatory protein A and a denitrification regulator gene operon dnrST are adjacent to the nar operons. Strain YIM 77409 T contains a putative nirS, which encodes the isofunctional tetraheme cytochrome cd1-containing nitrite reductase. The nirK gene, encoding a Cu-containing nitrite reductase in T. scotoductus SA-01, is absent in strains YIM 77409 T and YIM 77401. Genes encoding conserved hypothetical proteins, coenzyme PQQ synthesis protein (PqqE), and nitric oxide reductase subunits b (NorB) and c (NorC) were also present in the YIM 77409 T genome. Genes encoding the periplasmic multicopper enzyme nitrous oxide reductase (Nos), which catalyzes the last step of denitrification (N2O → N2), were not observed in the YIM 77409 T genome or in any Thermus spp. genomes. Physiological experiments with nitrate as the sole terminal electron acceptor also confirm that strain YIM 77409 T can convert nitrate to nitrous oxide under anaerobic conditions, but not to nitrogen gas.
Fig. 3 … T. amyloliquefaciens YIM 77409 T, T. tengchongensis YIM 77401, and T. scotoductus SA-01. Fe: heme protein-containing nitrite reductase, Cu: copper-containing nitrite reductase. Numbers below the genes indicate the provisional ORF numbers in T. amyloliquefaciens YIM 77409 T and T. tengchongensis YIM 77401; the locations in the chromosome are indicated below. nar: nitrate reductase gene; nir: nitrite reductase gene; nor: nitric oxide reductase gene; dnr: denitrification regulator gene [43,55,56]. This figure is modified from Murugapiran et al. [17]
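The reasoning that links gene content to the incomplete denitrification phenotype can be sketched as a walk along the pathway that stops at the first missing enzyme. The Python example below is a simplified illustration under stated assumptions: each step is treated as present or absent based only on the enzyme classes named in the text, and the function name, data layout, and the enzyme set assigned to strain YIM 77409 T (Nar, Nir via nirS, Nor via norBC, and no Nos) are a hypothetical encoding of the annotation described above.

```python
# Ordered denitrification steps: (enzyme class, substrate, product)
STEPS = [
    ("Nar/Nap", "NO3-", "NO2-"),
    ("Nir", "NO2-", "NO"),
    ("Nor", "NO", "N2O"),
    ("Nos", "N2O", "N2"),
]


def terminal_product(enzymes_present, start="NO3-"):
    """Follow the pathway from `start` until an enzyme is missing."""
    current = start
    for enzyme, substrate, product in STEPS:
        if substrate != current:
            continue  # this step is upstream of the starting compound
        if enzyme not in enzymes_present:
            break     # pathway stops at the first missing reductase
        current = product
    return current


# Assumed encoding of the annotation described in the text for YIM 77409 T
yim_77409 = {"Nar/Nap", "Nir", "Nor"}
print(terminal_product(yim_77409))
```

With nitrous oxide reductase absent, the walk ends at N2O, in line with the physiological result reported above. Regulation, expression levels, and alternative enzymes are ignored in this sketch.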

Conclusions
The genus Thermus comprises archetypal thermophilic bacteria that have been isolated from both natural and man-made thermal environments. Members of this genus are of significance as a source of thermostable enzymes of great biotechnological interest and as excellent laboratory models to study the molecular basis of thermal stability. Here, we report the annotation of a high-quality draft genome sequence of Thermus amyloliquefaciens YIM 77409 T. Analysis of the genome revealed that strain YIM 77409 T encodes enzymes for complete glycolysis, the pentose phosphate pathway, the tricarboxylic acid cycle, and pyruvate dehydrogenase. The genome sequence of strain YIM 77409 T provides insights to better understand the molecular mechanisms of the incomplete denitrification phenotype and the ecological roles that Thermus species play in nitrogen cycling. Combined analysis of this genome and other Thermus genomes also provides important insights into the evolution and ecology of this group and the role it may play in the high-temperature nitrogen biogeochemical cycle.