High-quality permanent draft genome sequence of the Lebeckia-nodulating Burkholderia dilworthii strain WSM3556T

Burkholderia dilworthii strain WSM3556T is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from an effective N2-fixing root nodule of Lebeckia ambigua collected near Grotto Bay Nature Reserve, in the Western Cape of South Africa, in October 2004. This plant persists in infertile, deep sandy soils with acidic pH, and is therefore an ideal candidate for a perennial-based agricultural system in Western Australia. WSM3556T thus represents a potential inoculant-quality strain for L. ambigua. Here we describe the general features of this strain, together with its genome sequence and annotation. The 7,679,067 bp high-quality permanent draft genome is arranged in 140 scaffolds of 141 contigs, contains 7,059 protein-coding genes and 64 RNA-only encoding genes, and is part of the GEBA-RNB project proposal.


Introduction
Over the last decade, agricultural scientists have sought to discover perennial legumes from a wide range of natural environments to develop new plants for grazing systems [1]. It is thought that these plants might be more resilient to changing rainfall patterns, such as those in the target environments of Western Australia. Here, winter rainfall has declined by 20 % in the last two decades [2], although more frequent summer rainfall events have been experienced. In the fynbos biome of South Africa, several species that offer potential for domestication have been discovered [1,3]. These legumes are frequently nodulated by Burkholderia bacteria in the class Betaproteobacteria [3,4]. The symbiosis between these Burkholderia and legumes from the genera Lebeckia and Rhynchosia fixes atmospheric nitrogen, which enables the cultivation of these legumes on infertile soils [4-7]. Lebeckia ambigua is proving well adapted to Western Australia [1] because the soil and climatic conditions in the areas of South Africa where it occurs naturally approximate those of Western Australia.
Nodules and seeds of L. ambigua were collected in four expeditions to the Western Cape of South Africa between 2002 and 2007. The isolation of bacteria from these nodules gave rise to a collection of 23 strains that were identified as Burkholderia [3]. Unlike most of the previously studied nodulating Burkholderia strains, this South African group appears to associate with papilionoid forage legumes, rather than Mimosa species. WSM3556T belongs to a subgroup of strains that were isolated in 2004 from nodules collected south-west of Darling, in a natural rangeland site on the southern border of the Grotto Bay Nature Reserve [3]. The soil at the site of collection was deep sand with a pH of 6. Burkholderia dilworthii strain WSM3556T was isolated from those nodules and is effective at fixing nitrogen with L. ambigua and L. sepiaria. The nodules formed by these symbioses are crotaloid and indeterminate [3].
WSM3556T thus represents a potential inoculant-quality strain for L. ambigua, which is being developed as a grazing legume adapted to infertile soils that receive 250-400 mm annual rainfall in southern Australia, and is therefore of special interest to the RNB chapter of the GEBA project. Here we present a summary classification and a set of general features for Burkholderia dilworthii strain WSM3556T together with the description of the permanent draft genome sequence and annotation.

Classification and features
Burkholderia dilworthii strain WSM3556T is a motile, Gram-negative, non-spore-forming rod (Fig. 1 Left, Center) in the order Burkholderiales of the class Betaproteobacteria. The rod-shaped form varies in size with dimensions of 0.9-2 μm in width and 0.4-3.0 μm in length (Fig. 1 Left). It is fast growing, forming 0.4-2 mm diameter colonies after 24 h when grown on half Lupin Agar (½LA) [8] and TY [9] at 28°C. Colonies on ½LA are white-opaque, slightly domed and moderately mucoid with smooth margins (Fig. 1 Right). Additional physiological properties of this strain were previously published [5]. Figure 2 shows the phylogenetic relationship of Burkholderia dilworthii strain WSM3556T in a 16S rRNA gene sequence based tree. This strain is most similar to Burkholderia rhynchosiae WSM3937T and Burkholderia phytofirmans PsJNT based on the 16S rRNA gene, with sequence identities of 98.50 % and 98.11 %, respectively, as determined using the EzTaxon-e server [10]. Burkholderia rhynchosiae WSM3937T has been isolated from Rhynchosia ferulifolia, a herbaceous legume from the fynbos biome in South Africa [7]. Burkholderia phytofirmans PsJNT was isolated from surface-sterilized onion roots and has plant growth promoting properties on various plants; however, it has not been reported in association with legumes [11]. Minimum Information about the Genome Sequence of WSM3556T is provided in Table 1.

Fig. 1 Images of Burkholderia dilworthii strain WSM3556T using scanning (Left) and transmission (Center) electron microscopy and the appearance of colony morphology on solid media (Right)

Fig. 2 Phylogenetic tree highlighting the position of Burkholderia dilworthii strain WSM3556T (shown in blue print), relative to other strains in the Burkholderia genus using a 1,322 bp internal region of the 16S rRNA gene. Cupriavidus taiwanensis LMG 19424T was used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 5.05 [31]. The tree was built using the maximum likelihood method with the General Time Reversible model. Bootstrap analysis with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [14] are in bold print and the GOLD ID is provided after the NCBI accession number. Published genomes are designated with an asterisk
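The pairwise 16S rRNA identities reported in this section come from the EzTaxon-e server, whose exact procedure is not described here. As an illustration only, the sketch below shows one common way to compute percent identity from two sequences that have already been aligned: count matching bases over the columns where neither sequence has a gap. The function name and the toy sequences are hypothetical and are not taken from the WSM3556T data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # compared columns where the bases agree
    for base_a, base_b in zip(aligned_a.upper(), aligned_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a == base_b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, for illustration only).
toy_a = "ACGTACGT-A"
toy_b = "ACGTTCGTAA"
toy_identity = percent_identity(toy_a, toy_b)
print(f"{toy_identity:.2f} % identity")
```

Other definitions exist, for example ones that count gap columns as mismatches, so values from different tools are comparable only when the same convention is used.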

Genome project history
This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, Root Nodulating Bacteria (GEBA-RNB) chapter project at the U.S. Department of Energy Joint Genome Institute (JGI) for projects of relevance to agency missions [13]. The genome project is deposited in the Genomes OnLine Database [14] and the high-quality permanent draft genome sequence in IMG [15]. Sequencing, finishing and annotation were performed by the JGI using state-of-the-art sequencing technology [16]. A summary of the project information is shown in Table 2.

MIGS-4.4 Altitude 237 IDA
Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [41].

Growth conditions and genomic DNA preparation
Burkholderia dilworthii strain WSM3556T was grown on TY solid medium [9] for 3 days, and a single colony was selected and used to inoculate 5 ml TY broth medium. The culture was grown for 48 h on a gyratory shaker (200 rpm) at 28°C. Subsequently, 1 ml was used to inoculate 60 ml TY broth medium, which was grown on a gyratory shaker (200 rpm) at 28°C until an OD of 0.6 was reached. DNA was isolated from 60 ml of cells using a CTAB bacterial genomic DNA isolation method [17]. The final concentration of the DNA was 0.5 mg/ml.

Genome sequencing and assembly
The genome of Burkholderia dilworthii strain WSM3556T was sequenced at the DOE Joint Genome Institute using state-of-the-art technology [18]. For this genome, an Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which generated 9,394,768 reads totalling 2,818.4 Mbp of Illumina data. All general aspects of library construction and sequencing performed at the JGI can be found on the JGI web site [16]. All raw Illumina sequence data were passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts (Mingkun L, Copeland A, Han J. unpublished). The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet, version 1.1.04 [19], (2)

Genome annotation
Genes were identified using Prodigal [22], as part of the DOE-JGI genome annotation pipeline [23,24], followed by a round of manual curation using GenePRIMP [25] for finished genomes and draft genomes in fewer than 10 scaffolds. The predicted CDSs were translated and   [26] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [27]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and the RNase P, were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [28]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes-Expert Review system [29] developed by the Joint Genome Institute, Walnut Creek, CA, USA.

Genome properties
The genome is 7,679,067 nucleotides long with 61.77 % GC content (Table 3) and comprises 140 scaffolds and 141 contigs. Of a total of 7,123 genes, 7,059 were protein-encoding and 64 were RNA-only encoding genes. The majority of genes (76.25 %) were assigned a putative function, whilst the remaining genes were annotated as hypothetical. The distribution of genes into COG functional categories is presented in Table 4.
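The figures reported for this genome can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in this report (genome size, gene counts, read count and total Illumina yield) and derives a few quantities that the text does not state. It assumes that the 76.25 % refers to the total gene count and that nominal sequencing depth can be approximated as raw yield divided by genome size. The variable names are illustrative.

```python
# Values taken from the text of this report.
GENOME_SIZE_BP = 7_679_067
TOTAL_GENES = 7_123
PROTEIN_CODING_GENES = 7_059
RNA_ONLY_GENES = 64
FUNCTION_ASSIGNED_PCT = 76.25   # assumption: percentage of TOTAL_GENES
ILLUMINA_READS = 9_394_768
ILLUMINA_YIELD_MBP = 2_818.4

# The two gene classes should add up to the reported total.
gene_counts_consistent = PROTEIN_CODING_GENES + RNA_ONLY_GENES == TOTAL_GENES

# Share of all genes that encode proteins.
protein_coding_pct = 100.0 * PROTEIN_CODING_GENES / TOTAL_GENES

# Approximate number of genes with a putative function (derived, not reported).
approx_function_assigned = round(TOTAL_GENES * FUNCTION_ASSIGNED_PCT / 100.0)

# Average bases per counted read, and nominal depth before read filtering.
yield_bp = ILLUMINA_YIELD_MBP * 1e6
mean_bases_per_read = yield_bp / ILLUMINA_READS
nominal_depth = yield_bp / GENOME_SIZE_BP

print(f"gene counts consistent: {gene_counts_consistent}")
print(f"protein-coding share:   {protein_coding_pct:.2f} %")
print(f"genes with function:    ~{approx_function_assigned}")
print(f"bases per read:         {mean_bases_per_read:.1f}")
print(f"nominal depth:          {nominal_depth:.0f}x")
```

Under these assumptions the raw Illumina yield corresponds to a nominal depth of roughly 367-fold, at close to 300 bases per counted read. The depth actually used by the assembler would be lower after read filtering.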

Conclusion
Burkholderia dilworthii WSM3556T belongs to a group of Beta-rhizobia isolated from Lebeckia ambigua from the fynbos biome in South Africa [3]. WSM3556T is phylogenetically most closely related to Burkholderia rhynchosiae WSM3937T and Burkholderia phytofirmans PsJNT. Of these strains, only WSM3556T and WSM3937T are legume microsymbionts. Out of 13 Burkholderia strains that are known legume microsymbionts, only four (WSM3556T, WSM4176, WSM5005T, STM678T) nodulate South African papilionoid species. A comparison of these nodulating strains reveals that WSM3556T has the smallest genome (7.7 Mbp), the smallest KOG count (1295) and the lowest GC content (61.77 %) in this group. These four genomes share the nitrogenase-RXN MetaCyc pathway catalyzed by a multiprotein nitrogenase complex. Strains WSM3556T, WSM4176 and WSM5005T [30] have been shown to fix nitrogen with Lebeckia ambigua provenances with varying degrees of effectiveness. WSM3556T is partially effective on two out of three L. ambigua provenances, WSM4176 is partially effective on only one L. ambigua provenance and WSM5005T is effective on all three L. ambigua provenances. The genome sequences of these fynbos bacteria provide an unprecedented opportunity to reveal the genetic determinants required for effective nitrogen fixation with Lebeckia.