Genome sequence of Rhizobium leguminosarum bv trifolii strain WSM1689, the microsymbiont of the one flowered clover Trifolium uniflorum
© The Author(s) 2014
Published: 15 June 2014
Rhizobium leguminosarum bv. trifolii is a soil-inhabiting bacterium that has the capacity to be an effective N2-fixing microsymbiont of Trifolium (clover) species. R. leguminosarum bv. trifolii strain WSM1689 is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from a root nodule of Trifolium uniflorum collected on the edge of a valley 6 km from Eggares on the Greek Island of Naxos. Although WSM1689 is capable of highly effective N2-fixation with T. uniflorum, it is either unable to nodulate or unable to fix N2 with a wide range of both perennial and annual clovers originating from Europe, North America and Africa. WSM1689 therefore possesses a very narrow host range for effective N2 fixation and can thus play a valuable role in determining the geographic and phenological barriers to symbiotic performance in the genus Trifolium. Here we describe the features of R. leguminosarum bv. trifolii strain WSM1689, together with the complete genome sequence and its annotation. The 6,903,379 bp genome contains 6,709 protein-coding genes and 89 RNA-only encoding genes. This multipartite genome contains six distinct replicons: a chromosome of 4,854,518 bp and five plasmids of 667,306, 518,052, 341,391, 262,704 and 259,408 bp. This rhizobial genome is one of 20 sequenced as part of a DOE Joint Genome Institute 2010 Community Sequencing Program.
Keywords: root-nodule bacteria, nitrogen fixation, lupin-nodulating rhizobia, Alphaproteobacteria
The nitrogen (N) cycle is one of the most important biogeochemical processes underpinning the existence of life on Earth. A key step in this cycle is the conversion of relatively inert atmospheric dinitrogen (N2) into a bioaccessible form such as ammonia (NH3) through a process referred to as biological nitrogen fixation (BNF). BNF is performed only by a specialized subset of Bacteria and Archaea that possess the necessary cellular machinery to enzymatically reduce N2 into NH3. Some of these bacteria (termed rhizobia or root nodule bacteria) have evolved non-obligatory symbiotic relationships with legumes whereby the bacteria receive a carbon source from the plant and in return supply fixed N to the host. Harnessing this association can boost soil N-inputs and therefore production yields of legumes, or of non-legumes grown in subsequent years, without the need for supplementation with industrially synthesized N-based fertilizers.
Some of the most widely cultivated pasture legumes are members of the legume genus Trifolium (clover). The natural distribution of these species spans three centers of diversity, with an estimated 28% of species in the Americas, 57% in Eurasia and 15% in sub-Saharan Africa. Approximately 30 species of clover, predominantly of Eurasian origin, are widely grown as annual and perennial species in pasture systems in Mediterranean and temperate climatic zones. Globally important perennial species of clover include T. repens (white clover), T. pratense (red clover), T. fragiferum (strawberry clover) and T. hybridum (alsike clover). While clovers are known to form N2-fixing symbiotic associations with Rhizobium leguminosarum bv. trifolii, there is wide variation in symbiotic compatibility across different strains and hosts, from ineffective (non-N2-fixing) nodulation to fully effective N2-fixing partnerships.
Rhizobium leguminosarum bv. trifolii strain WSM1689 was isolated in 1995 from a nodule of the perennial clover Trifolium uniflorum collected on the edge of a valley 6 km from Eggares on the Greek Island of Naxos. T. uniflorum is one of a small number of perennial Trifolium spp. found in the dry Mediterranean basin. While WSM1689 has been shown to be either ineffective with or unable to nodulate a range of annual and perennial Trifolium spp., it is a highly effective N2-fixing microsymbiont of T. uniflorum. Therefore, R. leguminosarum bv. trifolii WSM1689 has a very narrow host range and thus represents a good isolate with which to study the genetic basis of symbiotic specificity. The availability of this sequence data also complements the already published genomes of the clover-nodulating R. leguminosarum bv. trifolii WSM1325 and WSM2304. Here we present a summary classification and a set of general features for R. leguminosarum bv. trifolii strain WSM1689 together with the description of the complete genome sequence and its annotation.
Classification and features
Species Rhizobium leguminosarum bv. trifolii
Soil, root nodule, host
Free living, symbiotic
Nodule collection date
Compatibility of WSM1689 with both perennial and annual Trifolium genotypes for nodulation (Nod) and N2-Fixation (Fix). Data compiled from .
Russian no 9
Genome sequencing and annotation
Genome project history
Genome sequencing project information for Rhizobium leguminosarum bv. trifolii strain WSM1689.
Illumina GAii shotgun and paired end 454 libraries
Illumina GAii and 454 GS FLX Titanium technologies
8.3x 454, 774.6x Illumina
VELVET, version 1.1.05; Newbler, version 2.6; phrap, version SPS - 4.24
Gene calling methods
Prodigal 1.4, GenePRIMP
Not yet available
Genbank Date of Release
Not yet released
NCBI project ID
Symbiotic nitrogen fixation, agriculture
Growth conditions and DNA isolation
Rhizobium leguminosarum bv. trifolii strain WSM1689 was grown to mid-logarithmic phase in TY rich medium on a gyratory shaker at 28°C. DNA was isolated from 60 mL of cells using a CTAB (cetyl trimethyl ammonium bromide) bacterial genomic DNA isolation method.
Genome sequencing and assembly
The genome of Rhizobium leguminosarum bv. trifolii strain WSM1689 was sequenced at the Joint Genome Institute (JGI) using a combination of Illumina and 454 technologies. An Illumina GAii shotgun library generated 73,565,648 reads totaling 5,591 Mbp, and a paired-end 454 library with an average insert size of 12 Kbp generated 376,185 reads totaling 93.4 Mbp. All general aspects of library construction and sequencing performed at the JGI can be found on the JGI website. The initial draft assembly contained 100 contigs in 4 scaffolds. The 454 paired-end data were assembled with Newbler, version 2.6. The Newbler consensus sequences were computationally shredded into 2 Kbp overlapping fake reads (shreds). Illumina sequencing data were assembled with VELVET, version 1.1.05, and the consensus sequence was computationally shredded into 1.5 Kbp overlapping fake reads (shreds). We integrated the 454 Newbler consensus shreds, the Illumina VELVET consensus shreds and the read pairs in the 454 paired-end library using parallel phrap, version SPS - 4.24 (High Performance Software, LLC). The software Consed [34–36] was used in the subsequent finishing process. Illumina data were used to correct potential base errors and increase consensus quality using the software Polisher developed at JGI (Alla Lapidus, unpublished). Possible mis-assemblies were corrected using gapResolution (Cliff Han, unpublished), Dupfinisher, or sequencing of cloned bridging PCR fragments with subcloning. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR (J-F Cheng, unpublished) primer walks. A total of 93 additional reactions were necessary to close gaps and to raise the quality of the finished sequence. The total genome size is 6.9 Mbp. The final assembly is based on 57.3 Mbp of 454 draft data, which provides an average 8.3× coverage of the genome, and 5,345 Mbp of Illumina draft data, which provides an average 774.6× coverage of the genome.
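The coverage figures and the shredding step described above can be illustrated with a minimal Python sketch. It is not the JGI pipeline: the function names are hypothetical, and the overlap between adjacent shreds is an assumed parameter, because the text gives only the shred lengths (2 Kbp and 1.5 Kbp). Coverage is taken as draft data divided by the rounded 6.9 Mbp genome size quoted in the text.

```python
from typing import List


def mean_coverage(draft_data_mbp: float, genome_size_mbp: float) -> float:
    """Average fold coverage: sequenced bases divided by genome size."""
    return draft_data_mbp / genome_size_mbp


def shred(consensus: str, shred_len: int, overlap: int) -> List[str]:
    """Cut a consensus sequence into overlapping pseudo-reads ("shreds").

    `overlap` is an assumed parameter; the source states only shred_len.
    The final shred may be shorter than shred_len.
    """
    if not 0 <= overlap < shred_len:
        raise ValueError("overlap must be >= 0 and smaller than shred_len")
    step = shred_len - overlap
    shreds = []
    for start in range(0, len(consensus), step):
        shreds.append(consensus[start:start + shred_len])
        if start + shred_len >= len(consensus):
            break  # this shred already reaches the end of the consensus
    return shreds


# Figures quoted in the text: 57.3 Mbp (454) and 5,345 Mbp (Illumina)
# of draft data against a 6.9 Mbp genome.
cov_454 = round(mean_coverage(57.3, 6.9), 1)
cov_illumina = round(mean_coverage(5345, 6.9), 1)

# Toy example with a 10-base "consensus", 4-base shreds, 2-base overlap.
toy_shreds = shred("ACGTACGTAC", shred_len=4, overlap=2)
```

With the numbers from the text, `cov_454` and `cov_illumina` reproduce the reported 8.3× and 774.6× values; for a real consensus one would call, for example, `shred(seq, 2000, overlap)` with an overlap chosen by the analyst.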
Genes were identified using Prodigal as part of the DOE-JGI genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE, RNAmmer, Rfam, TMHMM, and SignalP. Additional gene prediction analyses and functional annotation were performed within the Integrated Microbial Genomes (IMG-ER) platform [45,46].
Genome Statistics for Rhizobium leguminosarum bv. trifolii strain WSM1689.
% of Total
Genome size (bp)
DNA coding region (bp)
DNA G+C content (bp)
Number of replicons
Genes with function prediction
Genes assigned to COGs
Genes assigned Pfam domains
Genes with signal peptides
Genes coding transmembrane proteins
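As a consistency check on the replicon sizes given in the abstract, the following Python sketch tallies the six replicons against the reported genome size. The plasmid labels (`plasmid_1` to `plasmid_5`) are placeholders assumed for illustration; the source does not name the plasmids here.

```python
# Replicon sizes in bp as reported in the abstract.
# Plasmid labels are hypothetical placeholders, not names from the source.
REPLICON_SIZES_BP = {
    "chromosome": 4_854_518,
    "plasmid_1": 667_306,
    "plasmid_2": 518_052,
    "plasmid_3": 341_391,
    "plasmid_4": 262_704,
    "plasmid_5": 259_408,
}
REPORTED_GENOME_SIZE_BP = 6_903_379


def replicon_fractions(sizes):
    """Return each replicon's share of the summed genome size."""
    total = sum(sizes.values())
    return {name: size / total for name, size in sizes.items()}


total_bp = sum(REPLICON_SIZES_BP.values())
fractions = replicon_fractions(REPLICON_SIZES_BP)

for name, frac in fractions.items():
    print(f"{name}: {REPLICON_SIZES_BP[name]:,} bp ({100 * frac:.1f}%)")
print(f"total: {total_bp:,} bp")
```

The six sizes sum to the reported 6,903,379 bp, with the chromosome accounting for roughly 70% of the genome.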
Number of protein coding genes of Rhizobium leguminosarum bv. trifolii strain WSM1689 associated with the general COG functional categories.
Translation, ribosomal structure and biogenesis
RNA processing and modification
Replication, recombination and repair
Chromatin structure and dynamics
Cell cycle control, mitosis and meiosis
Signal transduction mechanisms
Cell wall/membrane biogenesis
Intracellular trafficking and secretion
Posttranslational modification, protein turnover, chaperones
Energy production and conversion
Carbohydrate transport and metabolism
Amino acid transport and metabolism
Nucleotide transport and metabolism
Coenzyme transport and metabolism
Lipid transport and metabolism
Inorganic ion transport and metabolism
Secondary metabolite biosynthesis, transport and catabolism
General function prediction only
Not in COGs
This work was performed under the auspices of the US Department of Energy’s Office of Science, Biological and Environmental Research Program, and by the University of California, Lawrence Berkeley National Laboratory under contract No. DE-AC02-05CH11231, Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344, and Los Alamos National Laboratory under contract No. DE-AC02-06NA25396. We gratefully acknowledge the funding received from the Murdoch University Sir Walter Murdoch Adjunct Professor Scheme for Professor Philip Poole.
- Terpolilli JJ, Hood GA, Poole PS. What determines the efficiency of N2-fixing Rhizobium-Legume symbioses? Adv Microb Physiol 2012; 60:325–389. http://dx.doi.org/10.1016/B978-0-12-398264-3.00005-X
- Howieson JG, O’Hara GW, Carr SJ. Changing roles for legumes in Mediterranean agriculture: developments from an Australian perspective. Field Crops Res 2000; 65:107–122. http://dx.doi.org/10.1016/S0378-4290(99)00081-7
- Lamont EJ, Zoghlami A, Hamilton RS, Bennett SJ. Clovers (Trifolium L.). In: Maxted N, Bennett SJ, editors. Plant Genetic Resources of Legumes in the Mediterranean. Dordrecht: Kluwer Academic Publishers; 2001. p 79–98.
- Howieson J, Yates R, O’Hara G, Ryder M, Real D. The interactions of Rhizobium leguminosarum biovar trifolii in nodulation of annual and perennial Trifolium spp from diverse centres of origin. Aust J Exp Agric 2005; 45:199–207. http://dx.doi.org/10.1071/EA03167
- Reeve W, O’Hara G, Chain P, Ardley J, Brau L, Nandesena K, Tiwari R, Copeland A, Nolan M, Han C, et al. Complete genome sequence of Rhizobium leguminosarum bv. trifolii strain WSM1325, an effective microsymbiont of annual Mediterranean clovers. Stand Genomic Sci 2010; 2:347–356. http://dx.doi.org/10.4056/sigs.852027
- Reeve W, O’Hara G, Chain P, Ardley J, Brau L, Nandesena K, Tiwari R, Malfatti S, Kiss H, Lapidus A, et al. Complete genome sequence of Rhizobium leguminosarum bv trifolii strain WSM2304, an effective microsymbiont of the South American clover Trifolium polymorphum. Stand Genomic Sci 2010; 2:66–76. http://dx.doi.org/10.4056/sigs.44642
- Howieson JG, Ewing MA, D’antuono MF. Selection for acid tolerance in Rhizobium meliloti. Plant Soil 1988; 105:179–188. http://dx.doi.org/10.1007/BF02376781
- Beringer JE. R factor transfer in Rhizobium leguminosarum. J Gen Microbiol 1974; 84:188–198. http://dx.doi.org/10.1099/00221287-84-1-188
- Terpolilli JJ. Why are the symbioses between some genotypes of Sinorhizobium and Medicago suboptimal for N2 fixation? Perth: Murdoch University; 2009. 223 p.
- Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen M, Angiuoli SV, et al. Towards a richer description of our complete collection of genomes and metagenomes “Minimum Information about a Genome Sequence” (MIGS) specification. Nat Biotechnol 2008; 26:541–547. http://dx.doi.org/10.1038/nbt1360
- Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576–4579. http://dx.doi.org/10.1073/pnas.87.12.4576
- Garrity GM, Bell JA, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 2, Part B, Springer, New York, 2005, p. 1.
- Garrity GM, Bell JA, Lilburn T. Class I. Alphaproteobacteria class. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 2, Part C, Springer, New York, 2005, p. 1.
- Validation List No. 107. List of new names and new combinations previously effectively, but not validly, published. Int J Syst Evol Microbiol 2006; 56:1–6. http://dx.doi.org/10.1099/ijs.0.64188-0
- Kuykendall LD. Order VI. Rhizobiales ord. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology. Second ed: New York: Springer-Verlag; 2005. p 324.
- Skerman VBD, McGowan V, Sneath PHA. Approved Lists of Bacterial Names. Int J Syst Bacteriol 1980; 30:225–420. http://dx.doi.org/10.1099/00207713-30-1-225
- Conn HJ. Taxonomic relationships of certain non-sporeforming rods in soil. J Bacteriol 1938; 36:320–321.
- Frank B. Über die Pilzsymbiose der Leguminosen. Ber Dtsch Bot Ges 1889; 7:332–346.
- Jordan DC, Allen ON. Genus I. Rhizobium Frank 1889, 338; Nom. gen. cons. Opin. 34, Jud. Comm. 1970, 11. In: Buchanan RE, Gibbons NE (eds), Bergey’s Manual of Determinative Bacteriology, Eighth Edition, The Williams and Wilkins Co., Baltimore, 1974, p. 262–264.
- Young JM, Kuykendall LD, Martínez-Romero E, Kerr A, Sawada H. A revision of Rhizobium Frank 1889, with an emended description of the genus, and the inclusion of all species of Agrobacterium Conn 1942 and Allorhizobium undicola de Lajudie et al. 1998 as new combinations: Rhizobium radiobacter, R. rhizogenes, R. rubi, R. undicola and R. vitis. Int J Syst Evol Microbiol 2001; 51:89–103.
- Editorial Secretary (for the Judicial Commission of the International Committee on Nomenclature of Bacteria). OPINION 34: Conservation of the Generic Name Rhizobium Frank 1889. Int J Syst Bacteriol 1970; 20:11–12. http://dx.doi.org/10.1099/00207713-20-1-11
- Ramírez-Bahena MH, García-Fraile P, Peix A, Valverde A, Rivas R, Igual JM, Mateos PF, Martínez-Molina E, Velázquez E. Revision of the taxonomic status of the species Rhizobium leguminosarum (Frank 1879) Frank 1889AL, Rhizobium phaseoli Dangeard 1926AL and Rhizobium trifolii Dangeard 1926AL. R. trifolii is a later synonym of R. leguminosarum. Reclassification of the strain R. leguminosarum DSM 30132 (=NCIMB 11478) as Rhizobium pisi sp. nov. Int J Syst Evol Microbiol 2008; 58:2484–2490. http://dx.doi.org/10.1099/ijs.0.65621-0
- Agents B. Technical rules for biological agents. TRBA (http://www.baua.de):466.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25–29. http://dx.doi.org/10.1038/75556
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Mol Biol Evol 2011; 28:2731–2739. http://dx.doi.org/10.1093/molbev/msr121
- Nei M, Kumar S. Molecular Evolution and Phylogenetics. New York: Oxford University Press; 2000.
- Felsenstein J. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 1985; 39:783–791. http://dx.doi.org/10.2307/2408678
- Liolios K, Mavromatis K, Tavernarakis N, Kyrpides NC. The Genomes On Line Database (GOLD) in 2007: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2008; 36:D475–D479. http://dx.doi.org/10.1093/nar/gkm884
- Reeve WG, Tiwari RP, Worsley PS, Dilworth MJ, Glenn AR, Howieson JG. Constructs for insertional mutagenesis, transcriptional signal localization and gene regulation studies in root nodule and other bacteria. Microbiology 1999; 145:1307–1316. http://dx.doi.org/10.1099/13500872-145-6-1307
- DOE Joint Genome Institute. http://my.jgi.doe.gov/general/index.html
- Bennett S. Solexa Ltd. Pharmacogenomics 2004; 5:433–438. http://dx.doi.org/10.1517/14622416.5.4.433
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005; 437:376–380.
- Zerbino DR. Using the Velvet de novo assembler for short-read sequencing technologies. Current Protocols in Bioinformatics 2010; Chapter 11:Unit 11.5.
- Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8:186–194. http://dx.doi.org/10.1101/gr.8.3.186
- Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 1998; 8:175–185. http://dx.doi.org/10.1101/gr.8.3.175
- Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8:195–202. http://dx.doi.org/10.1101/gr.8.3.195
- Han C, Chain P. Finishing repeat regions automatically with Dupfinisher. In: Valafar HRAH, editor. Proceeding of the 2006 international conference on bioinformatics & computational biology: CSREA Press; 2006. p 141–146.
- Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11:119. http://dx.doi.org/10.1186/1471-2105-11-119
- Pati A, Ivanova NN, Mikhailova N, Ovchinnikova G, Hooper SD, Lykidis A, Kyrpides NC. GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes. Nat Methods 2010; 7:455–457. http://dx.doi.org/10.1038/nmeth.1457
- Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997; 25:955–964.
- Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007; 35:3100–3108. http://dx.doi.org/10.1093/nar/gkm160
- Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res 2003; 31:439–441. http://dx.doi.org/10.1093/nar/gkg006
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 2001; 305:567–580. http://dx.doi.org/10.1006/jmbi.2000.4315
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004; 340:783–795. http://dx.doi.org/10.1016/j.jmb.2004.05.028
- Markowitz VM, Mavromatis K, Ivanova NN, Chen IM, Chu K, Kyrpides NC. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 2009; 25:2271–2278. http://dx.doi.org/10.1093/bioinformatics/btp393
- Integrated Microbial Genomes (IMG-ER) platform. http://img.jgi.doe.gov/er