High-quality permanent draft genome sequence of Rhizobium sullae strain WSM1592; a Hedysarum coronarium microsymbiont from Sassari, Italy

Rhizobium sullae strain WSM1592 is an aerobic, Gram-negative, non-spore-forming rod that was isolated from an effective nitrogen (N2) fixing root nodule formed on the short-lived perennial legume Hedysarum coronarium (also known as Sulla coronaria or Sulla). WSM1592 was isolated in 1995 from a nodule recovered from H. coronarium roots located in Ottava, bordering Sassari, Sardinia. WSM1592 is highly effective at fixing nitrogen with H. coronarium, and is currently the commercial Sulla inoculant strain in Australia. Here we describe the features of R. sullae strain WSM1592, together with genome sequence information and its annotation. The 7,530,820 bp high-quality permanent draft genome is arranged into 118 scaffolds of 118 contigs containing 7,453 protein-coding genes and 73 RNA-only encoding genes. This rhizobial genome was sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project.

Electronic supplementary material
The online version of this article (doi:10.1186/s40793-015-0020-2) contains supplementary material, which is available to authorized users.


Introduction
The accessibility and supply of nitrogen fertilizer is an ever-increasing challenge for world agriculture [1]. Although the Earth's atmosphere consists of approximately 78 % dinitrogen (N2), it is in a form that must be converted before it can be utilised by plants [2]. Conversion of N2 can be achieved by industrial chemical synthesis that relies on natural gas, but these methods can be considered unsustainable because they use exhaustible and costly fossil fuel resources [3]. In addition, the manufacturing process increases greenhouse gas emissions, and the field application of N fertiliser has been directly linked to the contamination of ecosystems and waterways, with detrimental effects. Alternatively, a more sustainable and environmentally friendly means of acquiring N is the biological process of N fixation by diazotrophs [2]. Most biological fixation in world agriculture is provided by symbiotic nitrogen fixation (SNF), which occurs following the successful formation of an effective symbiosis between leguminous plants and bacterial microsymbionts [4].
The productive efficiency of SNF in agricultural areas relies on considerable efforts by researchers and producers in matching suitable legume hosts with their compatible microsymbionts [5]. Some agricultural areas farm with indigenous legumes, while others introduce exotic legumes and their compatible microsymbionts from geographical locations that are edaphically and climatically similar to their own [4]. In Australia, for instance, selection programs have enabled the domestication of new Mediterranean legume species and their microsymbionts [6]. One such grazing legume species commercially introduced into Australian and New Zealand agriculture is the Papilionoid legume Hedysarum coronarium (also known as Sulla coronaria or Sulla). Sulla is a deep-rooted, short-lived perennial pasture legume that is grown throughout Mediterranean countries, where it is fed green or used for silage or hay [7]. The microsymbionts of Sulla display a high level of specificity for nodulation and nitrogen fixation [8]. However, when effectively nodulated, Sulla plants can biologically fix large amounts of nitrogen for increased paddock fertility [9].
Rhizobium sullae strain WSM1592 is the current Australian commercial inoculant for Sulla after replacing strain CC1335 in 2006. This strain has also been deposited in the Western Australian Soil Microbiology collection and is available for research. WSM1592 was isolated in 1995 from a nodule collected from a Sulla plant sampled on a roadside in calcareous loamy sand near the Ottava agriculture research farm, east of Sassari in Sardinia, Italy. The location has a Mediterranean climate with a long-term mean seasonal rainfall of 547 mm. Here we present a preliminary description of the general features for Rhizobium sullae strain WSM1592 together with its genome sequence and annotation.

Organism information
Classification and features
R. sullae strain WSM1592 is a motile, Gram-negative rod (Fig. 1 Left and Center) in the order Rhizobiales of the class Alphaproteobacteria. It is fast growing, forming colonies within 3-4 days when grown on half strength Lupin Agar (½LA) [10], tryptone-yeast extract agar (TY) [11] or a modified yeast-mannitol agar (YMA) [12] at 28°C. Colonies on ½LA are white-opaque, slightly domed and moderately mucoid with smooth margins (Fig. 1 Right). Figure 2 shows the phylogenetic relationship of R. sullae strain WSM1592 in a 16S rRNA gene sequence based tree. This strain is phylogenetically most closely related to Rhizobium sullae IS 123T, Rhizobium leguminosarum USDA 2370T and Rhizobium phaseoli ATCC 14482T, with sequence identities to the WSM1592 16S rRNA gene sequence of 100 %, 99.84 % and 99.84 %, respectively, as determined using the EzTaxon-e server [13]. Rhizobium sullae IS 123T was isolated from a Hedysarum coronarium root nodule discovered in Southern Spain [14]. In contrast, R. leguminosarum USDA 2370T was isolated from an effective nodule of Pisum sativum and is also able to nodulate Trifolium repens and Phaseolus vulgaris [15]. R. phaseoli ATCC 14482T was originally isolated from nodules of Phaseolus vulgaris and has been shown to nodulate Trifolium repens, but not Pisum sativum [15].
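The sequence identities quoted above are pairwise percentages between 16S rRNA gene sequences. As a rough illustration only, the following sketch computes percent identity over the ungapped columns of two sequences that have already been aligned. The function name and the toy sequences are hypothetical, and this is not a reproduction of the EzTaxon-e server's own procedure, which is not described here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap-containing columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example (not real 16S data): one mismatch in ten compared positions.
query = "ACGTACGTAC"
reference = "ACGTACGTAT"
identity = percent_identity(query, reference)  # 90.0
```

How gap-containing columns are treated changes the result, so a value from a sketch like this is only comparable to a published figure when the same convention is used.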
Minimum Information about the Genome Sequence [16] of WSM1592 is provided in Table 1 and Additional file 1: Table S1.

Symbiotaxonomy
Hedysarum coronarium is a short-lived perennial pasture legume native to the Mediterranean basin, and throughout the Hedysarum genus there is a large degree of specificity in symbiotic compatibility within this region [8]. Rhizobium sullae WSM1592 nodulates (Nod+) and fixes nitrogen effectively (Fix+) with Hedysarum coronarium. However, inoculation of H. spinosissimum, H. flexuosum and H. carnosum with WSM1592 results in mostly Nod- but always Fix- phenotypes.

Genome project history
This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, Root Nodulating Bacteria chapter project at the U.S. Department of Energy, Joint Genome Institute [17]. The genome project is deposited in the Genomes OnLine Database [18] and the high-quality permanent draft genome sequence in IMG [19]. Sequencing, finishing and annotation were performed by the JGI using state-of-the-art sequencing technology [20]. A summary of the project information is shown in Table 2.

Growth conditions and genomic DNA preparation
R. sullae WSM1592 was cultured to mid logarithmic phase in 60 ml of TY rich media [11] on a gyratory shaker at 28°C. DNA was isolated from the cells using a CTAB (Cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method [21].

Genome sequencing and assembly
The draft genome of R. sullae strain WSM1592 was generated at the DOE Joint Genome Institute using state-of-the-art technology [20]. An Illumina Std shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 29,255,624 reads. All general aspects of library construction and sequencing performed at the JGI can be found at the JGI's web site [22]. All raw Illumina sequence data was passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland A, Han J. unpublished). The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet (version 1.1.04) [23]; (2) 1-3 Kbp simulated paired end reads were created from Velvet contigs using wgsim [24]; (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r39750) [25].

Fig. 2 Phylogenetic tree highlighting the position of R. sullae strain WSM1592 (shown in blue print) relative to other type and non-type rhizobia strains using a 901 bp internal region of the 16S rRNA gene. Bradyrhizobium elkanii ATCC 49852T was used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [33]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [18] have the GOLD ID mentioned after the strain number and represented in bold, otherwise the NCBI accession number is provided. Finished genomes are designated with an asterisk.
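Step (2) of the assembly draws long-insert read pairs from already-assembled contigs so that the second assembler receives long-range linking information. The sketch below illustrates that idea only. It is not wgsim's implementation, it applies no sequencing error model, and the function names, the 100 bp read length and the random seed are assumptions chosen for illustration; only the 1-3 Kbp insert range comes from the text.

```python
import random

# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_pairs(contig, n_pairs, read_len=100,
                   min_insert=1000, max_insert=3000, seed=0):
    """Sample error-free read pairs from a contig.

    Each pair comes from a fragment whose length (the insert size) is drawn
    uniformly from min_insert..max_insert. The first read is the fragment's
    5' end; the second is the reverse complement of its 3' end.
    Contigs shorter than min_insert yield no pairs.
    """
    rng = random.Random(seed)
    if len(contig) < min_insert:
        return []
    pairs = []
    for _ in range(n_pairs):
        insert = rng.randint(min_insert, min(max_insert, len(contig)))
        start = rng.randint(0, len(contig) - insert)
        fragment = contig[start:start + insert]
        forward = fragment[:read_len]
        reverse = reverse_complement(fragment[-read_len:])
        pairs.append((forward, reverse))
    return pairs


# Toy usage on a random 5 kbp "contig" (hypothetical data).
toy_rng = random.Random(1)
toy_contig = "".join(toy_rng.choice("ACGT") for _ in range(5000))
toy_pairs = simulate_pairs(toy_contig, n_pairs=10)
```

Because the pairs are cut from contigs rather than sequenced, they carry no new sequence information; their value lies in the known distance between the two reads.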

Genome annotation
Genes were identified using Prodigal [26], as part of the DOE-JGI genome annotation pipeline [27,28]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAscan-SE tool [29] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [30]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and the RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [31]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes-Expert Review system [32] developed by the Joint Genome Institute, Walnut Creek, CA, USA.

Genome properties
The genome is 7,530,820 nucleotides with 59.87 % GC content (Table 3) and is comprised of 118 scaffolds of 118 contigs. From a total of 7,526 genes, 7,453 were protein encoding and 73 were RNA-only encoding genes. The majority of genes (78.42 %) were assigned a putative function whilst the remaining genes were annotated as hypothetical. The distribution of genes into COG functional categories is presented in Table 4.
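The figures in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in the text. It assumes that the 78.42 % refers to all 7,526 genes rather than to the protein-coding subset, and the average spacing per gene is a derived illustration, not a value reported by the authors.

```python
# Values reported in the text.
genome_bp = 7_530_820
protein_coding = 7_453
rna_only = 73
function_pct = 78.42  # assumed to be a percentage of all genes

# Protein-coding plus RNA-only genes should equal the stated total.
total_genes = protein_coding + rna_only  # 7526

# Share of genes that are protein coding, in percent.
protein_share = 100 * protein_coding / total_genes

# Approximate number of genes with a putative function (under the assumption above).
genes_with_function = round(total_genes * function_pct / 100)

# Average genome length per gene, a derived figure for illustration only.
bp_per_gene = genome_bp / total_genes

print(f"total genes: {total_genes}")
print(f"protein-coding share: {protein_share:.2f} %")
print(f"genes with putative function: ~{genes_with_function}")
print(f"average bp per gene: {bp_per_gene:.1f}")
```

The sum confirms that the stated total of 7,526 genes is consistent with the 7,453 protein-coding and 73 RNA-only genes.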

Conclusions
Rhizobium sullae WSM1592 was isolated from a root nodule of Hedysarum coronarium (also known as Sulla coronaria). Phylogenetic analysis revealed that WSM1592 is most closely related to Rhizobium sullae IS 123T, which was also isolated from Hedysarum coronarium, growing in Southern Spain. The genome of WSM1592 is the first to be described for a strain of Rhizobium sullae and is 7.5 Mbp, with a GC content of 59.87 %. As expected, this genome contains the nitrogenase-RXN MetaCyc pathway, characterized by the multiprotein nitrogenase complex, and the strain has been shown to fix nitrogen effectively with Hedysarum coronarium. The genome attributes of WSM1592 will be important for the characterisation of the genetic determinants required for the establishment of an effective symbiosis with Hedysarum.

Additional file
Additional file 1: Table S1. Associated MIGS record for WSM1592.