High-quality permanent draft genome sequence of Bradyrhizobium sp. strain WSM1743 - an effective microsymbiont of an Indigofera sp. growing in Australia

Bradyrhizobium sp. strain WSM1743 is an aerobic, motile, Gram-negative, non-spore-forming rod that can exist as a soil saprophyte or as a legume microsymbiont of an Indigofera sp. WSM1743 was isolated from a nodule recovered from the roots of an Indigofera sp. growing 20 km north of Carnarvon in Australia. It is slow growing, tolerates up to 1 % NaCl and is capable of growth at 37 °C. Here we describe the features of Bradyrhizobium sp. strain WSM1743, together with genome sequence information and its annotation. The 8,341,956 bp high-quality permanent draft genome is arranged into 163 scaffolds and 167 contigs, contains 7908 protein-coding genes and 75 RNA-only encoding genes and was sequenced as part of the Root Nodule Bacteria chapter of the Genomic Encyclopedia of Bacteria and Archaea project. Electronic supplementary material The online version of this article (doi:10.1186/s40793-015-0073-2) contains supplementary material, which is available to authorized users.


Introduction
Rhizobia are soil-dwelling bacteria that have acquired the ability to establish associations with leguminous plants to symbiotically fix nitrogen. After infection of the plant, the rhizobia become established within root nodules and can fix atmospheric dinitrogen gas into ammonia using a reaction that is catalyzed by the nitrogenase enzyme [1]. The export of fixed nitrogen to the plant improves growth and productivity under N-limiting environmental conditions. The effective use of the symbiosis leads to sustainable cropping systems with a net positive impact on the environment [2]. In Australia, the majority of productive legumes and their rhizobia in agricultural systems have been deliberately, or accidentally, introduced since European settlement [3]. Recently, however, interest has grown in the diversity of Australian native legumes and their microsymbionts [4].
The northwest of Western Australia is an ideal landscape to discover rhizobia nodulating indigenous legume flora [4] and is an area low in introduced legumes and inoculants. In 1996, an extensive survey of the area was conducted, revealing a range of indigenous legume genera, including a number of Indigofera spp. [4]. In Australia, this species has been found at dispersed locations in the Northern Territory, Queensland and Western Australia on dark brown clay loams and frequently on lands under cultivation [5]. The Australian Indigofera spp., based on their habitat, can be placed into three categories: (i) shrubs, including I. brevidens, I. australis, I. adesmiifolia and some members of the I. pratensis group, which occur mainly on better soil types on the east coast; (ii) perennial herbs, such as I. baileyi, I. efoliata, I. triflora, I. georgei, I. rugosa and members of the I. triflora and I. pratensis groups, which occur in the more arid, or seasonally dry, parts of Australia; and (iii) annual herbs, which are uncommon amongst the endemic species and include I. haplophylla and I. ammobia, which occur in the monsoon tropics and the Tanami and Great Sandy Deserts, respectively [6]. The native species with wide extra-Australian distributions (particularly I. colutea, I. hirsuta, I. linnaei and I. linifolia) occur in a variety of habitats, mostly towards the northern parts of Australia. It is likely that these taxa now inhabit a greater range than they did before European settlement, and the Australian populations of these species may have been augmented by the introduction of seed from non-Australian sources [6].
Since there is a paucity of information regarding microsymbionts of Indigofera, a collection of root nodules was obtained from the most prevalent Indigofera spp. present in northwest Australia, and the microsymbionts were isolated from these nodules. One microsymbiont, Bradyrhizobium sp. strain WSM1743, was isolated from a nodule recovered from the roots of an indigenous Indigofera sp. growing in red-brown sandy loam 40 m above sea level. The plant was located in natural bushland, approximately 20 km northeast of the town of Carnarvon in Western Australia [4]. The collection area has a warm semi-arid climate with a long-term mean seasonal rainfall of 226 mm.
Strain WSM1743 was identified as a Bradyrhizobium sp. based on 16S rRNA typing [4]. Most Bradyrhizobium spp., including WSM1743, cannot grow on sucrose or lactose, which may indicate the lack of a disaccharide uptake system [7]. However, WSM1743, unlike other Bradyrhizobium spp., is able to grow at 37°C and this ability could be a specific adaptation to the high soil temperatures experienced in the northwest of Western Australia [4]. Here we present a summary classification and a set of general features for this microsymbiont together with a description of its genome sequence and annotation done as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria project [8] (Additional file 2).

Organism information
Classification and features
Bradyrhizobium sp. strain WSM1743 is a motile, Gram-negative, non-spore-forming rod (Fig. 1 Left and Center) in the order Rhizobiales of the class Alphaproteobacteria. It is slow growing, forming colonies within 7-10 days when grown on half strength Lupin Agar (½LA) [9] at 28°C. Colonies on ½LA are white-opaque, slightly domed and moderately mucoid with smooth margins (Fig. 1 Right). This strain was isolated together with 7 other bacteria from native Indigofera plants and physiologically characterised. Strain WSM1743 was identified as slow growing with poor growth in 1 % NaCl, no growth at 2 to 3 % NaCl and average growth at pH 5 to 9 [4]. Bradyrhizobium type strains have a slow generation time (9 to 18 h) and fail to grow in media containing 2 % NaCl [10], which is consistent with the placement of WSM1743 in this genus. The maximal growth temperature for most Bradyrhizobium strains is 33 to 35°C, with many strains failing to grow above 34°C [10]. However, WSM1743 was able to grow at 37°C on ½LA medium [4], thereby extending the temperature range for Bradyrhizobium. Figure 2 shows the phylogenetic relationship of Bradyrhizobium sp. strain WSM1743 in a 16S rRNA gene sequence based tree. This strain is phylogenetically most closely related to the RNB type strains B. japonicum USDA 6T, B. lupini DSM30140T and B. yuanmingense LMG 21827T, with sequence identities to the WSM1743 16S rRNA gene sequence of 99.78 %, 99.71 % and 99.63 %, respectively, as determined using the EzTaxon-e server [11]. B. japonicum USDA 6T was originally isolated in Japan from Glycine max root nodules and is able to nodulate and fix nitrogen effectively with several other Glycine species and Macroptilium atropurpureum [12]. B. lupini DSM30140T is a microsymbiont of Lupinus luteus and Lupinus angustifolius [13]. B. yuanmingense B071T was isolated from Lespedeza cuneata root nodules from China but is also able to nodulate and fix nitrogen effectively with Vigna unguiculata and Glycyrrhiza uralensis [14]. Additionally, a recent report showed that B. yuanmingense and B. japonicum are the preferred microsymbionts of Vigna unguiculata and Vigna radiata in the subtropical region of China [15].
Minimum Information about the Genome Sequence (MIGS) [16] of WSM1743 is provided in Table 1 and Additional file 1: Table S1.

Symbiotaxonomy
Bradyrhizobium sp. strain WSM1743 was isolated from Indigofera sp. nodules collected at site 32, north of Carnarvon, Western Australia [4]. The collection site, which had a soil pH of 7.5, contained several Australian native legumes. Symbiotic interactions of Bradyrhizobium sp. strain WSM1743 were assessed on three annual, one biennial and nine perennial exotic legume species that have agricultural use, or potential use, in southern Australia. WSM1743 consistently nodulated the exotic legume species Macroptilium atropurpureum and Phaseolus vulgaris, nodulated Ononis natrix inconsistently, and did not form nodules with Argyrolobium uniflorum, Chamaecytisus proliferus, Sutherlandia microphylla, Hedysarum coronarium, Medicago sativa, Ornithopus sativus, O. compressus, Trifolium burchellianum, T. polymorphum and T. uniflorum. Strain WSM1743 was able to consistently nodulate the Australian native legumes Acacia saligna, Kennedia prorepens and K. coccinea, but could not nodulate Swainsona pterostylis, S. formosa and S. macculochiana [4]. Additionally, it was noted that the isolate could not nodulate Indigofera brevidens, an indigenous Indigofera found in the same location as the host of WSM1743 [4].
Fig. 2 Phylogenetic tree highlighting the position of Bradyrhizobium sp. strain WSM1743 (shown in blue print) relative to other type and non-type strains in the Bradyrhizobium genus using a 1,251 bp internal region of the 16S rRNA gene. Azorhizobium caulinodans LMG 6465T sequence was used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [17]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [18] are shown in bold and have the GOLD ID mentioned after the strain number; otherwise the NCBI accession number is provided.

Genome project history
This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, The Root Nodulating Bacteria chapter (GEBA-RNB) project at the U.S. Department of Energy, Joint Genome Institute [8]. The genome project is deposited in the Genomes On-Line Database [18] and the high-quality permanent draft genome sequence is deposited in IMG [26]. Sequencing, finishing and annotation were performed by the JGI using state-of-the-art sequencing technology [27]. A summary of the project information is shown in Table 2.

Growth conditions and genomic DNA preparation
Bradyrhizobium sp. strain WSM1743 was cultured to mid-logarithmic phase in 60 ml of TY rich media on a gyratory shaker at 28°C [29]. DNA was isolated from the cells using a cetyltrimethylammonium bromide (CTAB) bacterial genomic DNA isolation method [30].

Genome sequencing and assembly
The draft genome of Bradyrhizobium sp. strain WSM1743 was generated at the DOE Joint Genome Institute [31]. An Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 14,683,452 reads totaling 2.2 Gbp. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI web site [31]. All raw Illumina sequence data was passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland A, Han J. unpublished). Artifact-filtered sequence data was then screened and trimmed according to the k-mers present in the dataset (Mingkun L, Copeland A, Han J. unpublished).
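As a rough consistency check on these figures, the mean read length and the raw sequencing depth can be estimated from the reported read count, the reported total yield and the size of the draft assembly. The sketch below is illustrative only: the function names are hypothetical, "2.2 Gbp" is a rounded value, and the depth is computed before any filtering or normalization.

```python
# Figures reported in the text; TOTAL_BASES is rounded in the source.
READS = 14_683_452        # Illumina reads generated
TOTAL_BASES = 2.2e9       # "2.2 Gbp"
GENOME_SIZE = 8_341_956   # bp, size of the permanent draft assembly


def mean_read_length(total_bases: float, reads: int) -> float:
    """Average number of bases per read."""
    return total_bases / reads


def raw_coverage(total_bases: float, genome_size: int) -> float:
    """Raw sequencing depth: bases sequenced per base of genome."""
    return total_bases / genome_size


if __name__ == "__main__":
    print(f"mean read length: {mean_read_length(TOTAL_BASES, READS):.0f} bp")
    print(f"raw coverage: {raw_coverage(TOTAL_BASES, GENOME_SIZE):.0f}x")
```

With the rounded yield, this gives a mean read length of about 150 bp and a raw depth on the order of 260-fold.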
High-depth k-mers, presumably derived from MDA amplification bias, cause problems in the assembly, especially if the k-mer depth varies by orders of magnitude between different regions of the genome. Reads with high k-mer coverage (>30x average k-mer depth) were normalized to an average depth of 30x. Reads with an average k-mer depth of less than 2x were removed. The following steps were then performed for assembly: (1) normalized Illumina reads were assembled using Velvet version 1.1.04 [32]; (2) 1-3 Kbp simulated paired-end reads were created from Velvet contigs using wgsim [33]; (3) normalized Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r39750) [34]. Parameters for assembly steps were
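The depth-based read filtering described in this section (removal of reads with an average k-mer depth below 2x and normalization of reads above 30x) can be illustrated with a minimal sketch. This is not the JGI implementation, which is unpublished. The function names, the k-mer size and the probabilistic down-sampling scheme are assumptions chosen for illustration, and only the 2x and 30x thresholds come from the text.

```python
import random
from collections import Counter


def kmers(read: str, k: int):
    """Yield every k-mer (substring of length k) in a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def normalize_reads(reads, k=21, min_depth=2.0, target_depth=30.0, seed=0):
    """Drop low-depth reads and down-sample high-depth reads.

    Illustrative only: thresholds mirror the text (2x and 30x), but the
    sampling scheme is an assumption, not the pipeline used by the JGI.
    """
    rng = random.Random(seed)
    # Count how often each k-mer occurs across the whole read set.
    counts = Counter(km for read in reads for km in kmers(read, k))
    kept = []
    for read in reads:
        depths = [counts[km] for km in kmers(read, k)]
        if not depths:
            continue  # read shorter than k: no k-mers to judge it by
        avg = sum(depths) / len(depths)
        if avg < min_depth:
            continue  # average k-mer depth below the minimum: remove
        # Keep high-depth reads with probability target/avg so that the
        # expected retained depth is roughly target_depth.
        if avg <= target_depth or rng.random() < target_depth / avg:
            kept.append(read)
    return kept
```

In this sketch a read seen only once is discarded, reads at moderate depth pass through unchanged, and heavily over-represented reads are thinned at random.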

Genome annotation
Genes were identified using Prodigal [35], as part of the DOE-JGI genome annotation pipeline [36,37]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAscan-SE tool [38] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [39]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [40]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes-Expert Review system [41] developed by the Joint Genome Institute, Walnut Creek, CA, USA.

Genome properties
The genome is 8,341,956 nucleotides long with 63.37 % GC content (Table 3) and comprises 163 scaffolds and 167 contigs. Of a total of 7983 genes, 7908 were protein-encoding and 75 were RNA-only encoding genes. The majority of genes (71.51 %) were assigned a putative function, whilst the remaining genes were annotated as hypothetical. The distribution of genes into COG functional categories is presented in Table 4.
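The reported gene counts can be checked for internal consistency, and a few derived statistics follow directly from them. The sketch below uses only figures stated in the text. The function names and the choice of derived statistics are illustrative assumptions, not part of the annotation pipeline.

```python
# Figures reported in the text.
GENOME_BP = 8_341_956
SCAFFOLDS = 163
TOTAL_GENES = 7983
PROTEIN_CODING = 7908
RNA_ONLY = 75


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def summarize() -> dict:
    """Check that the gene classes sum to the total and derive summaries."""
    if PROTEIN_CODING + RNA_ONLY != TOTAL_GENES:
        raise ValueError("gene classes do not sum to the reported total")
    return {
        "protein_coding_pct": percent(PROTEIN_CODING, TOTAL_GENES),
        "rna_only_pct": percent(RNA_ONLY, TOTAL_GENES),
        "genes_per_kb": TOTAL_GENES / (GENOME_BP / 1000),
        "mean_scaffold_bp": GENOME_BP / SCAFFOLDS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

The two gene classes sum to the reported total of 7983. Protein-coding genes account for about 99 % of all genes, and the mean scaffold length is about 51 kb.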

Conclusion
Bradyrhizobium sp. WSM1743 belongs to a group of Alpha-rhizobia microsymbionts from native Australian legumes and was isolated from a nodule of an Indigofera species growing 20 km north of Carnarvon in northwestern Australia. Phylogenetic analysis revealed that WSM1743 is most closely related to B. japonicum USDA 6T, which was obtained from Glycine max root nodules from Japan and is able to nodulate and fix nitrogen effectively with several other Glycine species and Macroptilium atropurpureum [12]. Strain WSM1743 has been shown to nodulate Macroptilium atropurpureum and endemic Australian legumes including Acacia saligna, Kennedia prorepens and K. coccinea [4].