High-quality permanent draft genome sequence of Bradyrhizobium sp. Tv2a.2, a microsymbiont of Tachigali versicolor discovered in Barro Colorado Island of Panama

Bradyrhizobium sp. Tv2a.2 is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an effective nitrogen-fixing root nodule of Tachigali versicolor collected in Barro Colorado Island of Panama. Here we describe the features of Bradyrhizobium sp. Tv2a.2, together with high-quality permanent draft genome sequence information and annotation. The 8,496,279 bp high-quality draft genome is arranged in 87 scaffolds of 87 contigs and contains 8,109 protein-coding genes and 72 RNA-only encoding genes. This rhizobial genome was sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.


Introduction
Legumes engage in nitrogen-fixation symbioses with bacterial partners from at least 13 genera of Proteobacteria [1][2][3][4]. Despite the high extent of phylogenetic diversity of root nodule bacteria, the very broad distribution of one particular genus (Bradyrhizobium) across host legume clades suggests that bacteria in this genus may have been the first legume symbionts [5]. Bradyrhizobium interacts with the widest diversity of legume clades (at least 24 of ca. 33 nodule-forming legume tribes; [6]) and is associated with nodulating groups that represent early branching lineages [7] in all three legume subfamilies [8,9]. Analysis of basal Bradyrhizobium lineages that are associated with early-diverging legume groups may thus shed light on the origins of this symbiosis.
Here we report the genome sequence of one such organism, Bradyrhizobium strain Tv2a.2. Strain Tv2a.2 was sampled in 1997 from the tree Tachigali versicolor on Barro Colorado Island, Panama, a biological preserve with an old-growth moist tropical forest [10]. Tachigali is one of just a handful of nodule-forming genera in the legume subfamily Caesalpinioideae [11], which comprises the earliest branching lineages in the legume family [7]. Tachigali versicolor is a large canopy tree with an unusual monocarpic life history, in which trees grow for decades without flowering, produce just a single crop of seeds, and then die [12]. Strain Tv2a.2 is a typical representative of the nodule symbionts that are associated with Tachigali in this tropical forest habitat [13], and appears to represent a unique early-diverging lineage of Bradyrhizobium. Phylogenetic analyses have placed Tv2a.2 near the early split in the genus between two large superclades represented by B. diazoefficiens USDA 110 and B. elkanii USDA 76. However, its exact position near the base of the Bradyrhizobium tree varies to some extent in different analyses, depending on the loci, the strains included, and the method of tree analysis [5,13]. For example, a Bayesian analysis of 16S rRNA sequences from the type strains of 21 Bradyrhizobium species and strain ORS278 placed Tv2a.2 as the earliest diverging Bradyrhizobium lineage [14].
Here we provide an analysis of the high-quality draft genome sequence of Tv2a.2, one of the rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project proposal [15]. The properties of this genome should help to clarify early events in the diversification of the genus Bradyrhizobium as a whole.

Classification and features
Bradyrhizobium sp. Tv2a.2 is a motile, non-sporulating, non-encapsulated, Gram-negative strain in the order Rhizobiales of the class Alphaproteobacteria. The rod-shaped form (Figure 1 Left, Center) has dimensions of approximately 0.5 μm in width and 1.5-2.0 μm in length. It is relatively slow growing, forming colonies after 6-7 days when grown on half strength Lupin Agar (½LA) [16], tryptone-yeast extract agar (TY) [17] or a modified yeast-mannitol agar (YMA) [18] at 28°C. Colonies on ½LA are opaque, slightly domed and moderately mucoid with smooth margins (Figure 1 Right). Figure 2 shows the phylogenetic relationship of Bradyrhizobium sp. Tv2a.2 in a 16S rRNA gene sequence based tree. This strain is phylogenetically most closely related to Bradyrhizobium sp. EC3.3, based on a 16S rRNA gene sequence identity of 99.31% as determined using BLAST analysis [19]. Tv2a.2 is also related to the type strains Bradyrhizobium ingae BR 10250 T and Bradyrhizobium iriomotense EK05 T, with 16S rRNA gene sequence identities of 99.16% and 99.08%, respectively, based on results from the EzTaxon-e server [20,21].
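To make the meaning of a 16S rRNA gene sequence identity value concrete, the following minimal Python sketch computes percent identity from two pre-aligned sequences. It is an illustration only: the function name `percent_identity` and the toy sequences are hypothetical, and the convention used here (matches divided by columns in which neither sequence has a gap) is a simplifying assumption. BLAST and the EzTaxon-e server apply their own alignment and scoring procedures, so this sketch would not necessarily reproduce the values reported above.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs are assumed to be rows of the same pairwise alignment,
    with '-' marking gaps. This convention is an assumption of the sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues are identical
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gap-containing columns
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences, not real 16S rRNA data):
# 10 aligned columns, 9 identical.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(f"{toy_identity:.2f}%")
```

Under this convention, an identity above 99% over a near full-length 16S rRNA gene corresponds to only a handful of differing positions, which is why closely related Bradyrhizobium strains are hard to separate on this locus alone.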
Minimum Information about the Genome Sequence (MIGS) of Tv2a.2 is provided in Table 1 and Additional file 1: Table S1.

Symbiotaxonomy
Bradyrhizobium strain Tv2a.2 was isolated from nodules of Tachigali versicolor found in a tropical forest on Barro Colorado Island, Panama [10]. Due to the highly erratic pattern of seed production from this host, no seeds of this legume were available to authenticate the symbiotic proficiency of strain Tv2a.2. Nodulation and nitrogen fixation were therefore tested on two promiscuous legumes (Vigna unguiculata, Macroptilium atropurpureum), which revealed that nodules could only develop on M. atropurpureum. Acetylene reduction assays also showed that these nodules lacked nitrogenase activity [13]. A further indication that Tv2a.2 may be relatively host-specific is the fact that extensive sampling of other legume hosts in Panama (and elsewhere in the Neotropics) has never recovered strains belonging to the Tv2a.2 lineage from any legume taxa other than T. versicolor [9].

Genome project history
This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, Root Nodulating Bacteria (GEBA-RNB) project at the U.S. Department of Energy, Joint Genome Institute (JGI). The genome project is deposited in the Genomes OnLine Database [22], and a high-quality permanent draft genome sequence is available in IMG [23]. Sequencing, finishing and annotation were performed by the JGI using state-of-the-art sequencing technology [24]. A summary of the project information is shown in Table 2.

Growth conditions and genomic DNA preparation
Bradyrhizobium sp. Tv2a.2 was cultured to mid-logarithmic phase in 60 ml of TY rich medium on a gyratory shaker at 28°C [25]. DNA was isolated from the cells using a CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [26].

Genome sequencing and assembly
The draft genome of Bradyrhizobium sp. Tv2a.2 was generated at the DOE Joint Genome Institute (JGI) using Illumina technology [27]. An Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 8,336,316 reads totaling 1,250.45 Mbp. All general aspects of library construction and sequencing were performed at the JGI, and details can be found on the JGI website [28]. All raw Illumina sequence data were passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland A, Han J, Unpublished). The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet (version 1.1.04) [29], (2) 1-3 Kbp simulated paired end reads were created from Velvet contigs using wgsim [30], (3) Illumina reads

Figure 2 Phylogenetic tree highlighting the position of Bradyrhizobium sp. Tv2a.2 (shown in blue print) relative to other type and non-type strains in the Bradyrhizobium genus using a 1,310 bp intragenic sequence of the 16S rRNA gene. Azorhizobium caulinodans ORS 571 T sequence was used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [41]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [22] have the GOLD ID mentioned after the strain number and are represented in bold, otherwise the NCBI accession number is provided.
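The read count and yield reported for the HiSeq 2000 run, together with the draft genome size, allow a rough consistency check. The Python sketch below derives the implied mean read length and the nominal raw sequencing depth. The function names are hypothetical, and the depth is a back-of-envelope estimate based on raw reads before DUK filtering and on the draft assembly length; it is not a coverage figure reported by the authors.

```python
# Values taken from the text: HiSeq 2000 run yield and draft genome size.
N_READS = 8_336_316          # number of Illumina reads
TOTAL_YIELD_MBP = 1250.45    # total sequence yield in Mbp
GENOME_SIZE_BP = 8_496_279   # draft assembly length in bp


def mean_read_length(total_yield_mbp: float, n_reads: int) -> float:
    """Mean read length in bp implied by total yield and read count."""
    return total_yield_mbp * 1e6 / n_reads


def nominal_depth(total_yield_mbp: float, genome_size_bp: int) -> float:
    """Nominal fold coverage: total sequenced bases divided by genome size."""
    return total_yield_mbp * 1e6 / genome_size_bp


read_len = mean_read_length(TOTAL_YIELD_MBP, N_READS)
depth = nominal_depth(TOTAL_YIELD_MBP, GENOME_SIZE_BP)
print(f"implied mean read length: {read_len:.1f} bp")
print(f"nominal raw depth: {depth:.1f}x")
```

The implied read length is close to 150 bp, and the nominal raw depth is on the order of 147-fold. The depth actually used in assembly would be lower after filtering.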

Genome annotation
Genes were identified using Prodigal [32], as part of the DOE-JGI genome annotation pipeline [33,34]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [35] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [36]. Other non-coding RNAs such as the RNA components of the protein

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
MP supplied the strain and background information for this project and the DNA to the JGI, TR performed all imaging, TR and WR drafted the paper, MNB and NAB provided financial support, and all other authors were involved in sequencing the genome and/or editing the final paper. All authors read and approved the final manuscript.