Complete genome sequence of Salmonella enterica subspecies arizonae str. RKS2983
© Wang et al. 2015
Received: 29 September 2014
Accepted: 21 April 2015
Published: 3 June 2015
Salmonella arizonae (also called Salmonella subgroup IIIa) is a Gram-negative, non-spore-forming, motile, rod-shaped, facultatively anaerobic bacterium. S. arizonae strain RKS2983 was isolated from a human in California, USA. S. arizonae lies between Salmonella subgroups I (human pathogens) and V (also called S. bongori; usually non-pathogenic to humans) in evolution and so is an ideal model organism for studies of bacterial evolution from non-human to human pathogens. We therefore sequenced the genome of RKS2983 to look for clues to genomic events that might have led to the divergence and speciation of Salmonella into distinct lineages with diverse host ranges and pathogenic features. The 4,574,836 bp complete genome contains 4,203 protein-coding genes, 82 tRNA genes and 7 rRNA operons. This genome contains several characteristics not reported to date in Salmonella subgroup I or V and may provide information about the genetic divergence of Salmonella pathogens.
Keywords: S. enterica subspecies arizonae RKS2983; Facultative anaerobe; Genomic evolution; Host-adapted; Salmonella pathogenicity islands
Salmonella are Gram-negative, facultatively anaerobic bacteria of the family Enterobacteriaceae that inhabit the gastrointestinal tract of a wide variety of animals. Over 2,600 serotypes (also called serovars) are currently documented in the genus Salmonella. On the basis of chromosomal DNA hybridization experiments and multilocus enzyme electrophoresis (MLEE), Salmonella are currently classified into two species, S. enterica and S. bongori (formerly subgroup V). The species S. enterica is further divided into six subspecies: S. enterica subspecies enterica, S. enterica subspecies salamae, S. enterica subspecies arizonae, S. enterica subspecies diarizonae, S. enterica subspecies houtenae, and S. enterica subspecies indica, corresponding to the former subgroups I, II, IIIa, IIIb, IV and VI, respectively. Additionally, subgroup VII was described by Boyd et al. [1,2]. Salmonella taxonomy is a dynamic field of research and many issues remain unsolved, especially regarding species definition [3-5]. To avoid confusion, we therefore use the traditional Salmonella classification system and the terms subgroup and serotype rather than subspecies or serovar (see more detailed explanation in ). Most Salmonella infections in warm-blooded animals are caused by Salmonella subgroup I serotypes, whereas non-subgroup I serotypes are typically associated with cold-blooded vertebrates and rarely colonize the intestines of warm-blooded animals.
Salmonella evolved from a common ancestor with Escherichia coli about 120–150 million years ago [6,7]. During the evolutionary process, several kinds of genomic events, such as gene mutation and gene acquisition or loss, might have led the bacteria to diverge. Importantly, numerous lines of evidence indicate that gene acquisition and loss are the major forces driving the evolution of virulence in Salmonella. In fact, it has been postulated that the evolution of Salmonella-specific virulence can be divided into three phases. The first phase is the split of Salmonella from E. coli through the acquisition by Salmonella of Salmonella pathogenicity island 1 (SPI-1), which is present in all lineages of Salmonella but absent from E. coli. SPI-1 encodes virulence factors that promote infection by Salmonella serotypes through different mechanisms, including invasion of intestinal epithelial cells, induction of neutrophil recruitment, and secretion of intestinal fluid [11-13]. The second phase is the divergence of Salmonella into S. bongori and S. enterica; the S. enterica lineage acquired SPI-2 [14-17], which contains genes encoding a type III secretion system that is required for survival in macrophages. The third phase is the adaptation of Salmonella subgroup I to warm-blooded animals, but the key genomic events involved remain unknown.
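The first two phases can be read as a presence/absence pattern: an island carried by every Salmonella lineage but not by E. coli is most simply explained by a gain before the genus diversified, whereas an island restricted to S. enterica points to a gain after the split from S. bongori. The following is a minimal sketch of that reasoning, using only the distribution described in the paragraph above; the dictionary layout and the function name `carriers` are hypothetical and purely illustrative.

```python
# Presence (True) / absence (False) of each island, following the
# distribution described in the text (illustrative encoding only).
ISLANDS = {
    "SPI-1": {"E. coli": False, "S. bongori": True, "S. enterica": True},
    "SPI-2": {"E. coli": False, "S. bongori": False, "S. enterica": True},
}


def carriers(island):
    """Return the sorted list of lineages that carry the given island."""
    return sorted(name for name, present in ISLANDS[island].items() if present)


for island in ISLANDS:
    print(island, "->", ", ".join(carriers(island)))
```

An island whose carriers include both Salmonella species maps to the first phase; one carried by S. enterica alone maps to the second.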
Genome sequencing efforts in Salmonella have mostly focused on Salmonella subgroup I serotypes, largely because of their pathogenicity in humans. In this study, we sequenced the genome of a strain from Salmonella subgroup IIIa (also known as Salmonella arizonae), which lies between Salmonella subgroups I and V in evolution. Given this evolutionary position, we anticipated that genomic comparisons of subgroup IIIa with other Salmonella subgroups, especially subgroups I and V, may provide novel insights into the evolutionary transition of Salmonella from cold-blooded to warm-blooded hosts.
Classification and features
Classification and general features of S. arizonae RKS2983 (table; only the rows below survive, and the evidence-code column is not recoverable)
Species: Salmonella enterica
Subspecies: Salmonella enterica subsp. arizonae
Sample collection time: not recoverable
We obtained RKS2983 from the Salmonella Genetic Stock Center (SGSC) as one of the strains in the Salmonella Reference Collection C set (SARC6); it was originally isolated from a human in California in 1985. Like other Salmonella, it is a Gram-negative rod about 0.7 to 1.5 μm in diameter and 2 to 5 μm in length, facultatively anaerobic, non-spore-forming, and predominantly motile with peritrichous flagella. The bacteria were grown at 37°C in Luria broth at pH 7.2-7.6. Detailed information on the strain can be found at the SGSC.
Genome sequencing information
Genome project history
Libraries used: Illumina paired-end library and SOLiD mate-pair library (2 × 50 bp)
Sequencing platforms: Illumina HiSeq 2000 and SOLiD 3.0
Gene calling method: Glimmer, as used in the RAST pipeline
GenBank date of release: September 22, 2014
Source material identifier
Project relevance: Evolution in bacteria
Growth conditions and DNA isolation
S. arizonae RKS2983 was cultured to mid-logarithmic phase in 50 ml of Luria broth on a gyratory shaker at 37°C. DNA was isolated from the cells using a cetyl trimethyl ammonium bromide (CTAB) bacterial genomic DNA isolation method.
Genome sequencing and assembly
The genome of S. arizonae RKS2983 was sequenced on two platforms, SOLiD 3.0 and Illumina HiSeq 2000. First, genomic DNA was sequenced on the Illumina platform using the paired-end strategy (2 × 100 bp); details of library construction and sequencing can be found at the Illumina web site. The Illumina HiSeq 2000 reads were assembled with SOAPdenovo v1.05, and the assembly contained 103 scaffolds with a total size of 4.5 Mb. Then, genomic DNA was sheared into 3 kb fragments with a Hydroshear instrument and sequenced on a SOLiD sequencer using the mate-pair strategy (2 × 50 bp) according to the manual for the instrument (Applied Biosystems). The two data sets were then assembled together with Velvet v1.2.09; the final assembly contained 20 scaffolds. Gaps between contigs were closed by PCR amplification and sequencing of the products on an ABI 3730 sequencer.
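The drop in scaffold number between assembly stages is the kind of change usually summarised with scaffold count, total length and N50. The sketch below shows how these summary values can be computed from a list of scaffold lengths. It is not the authors' pipeline: the function name `assembly_stats` and the toy lengths are assumptions for illustration, and no N50 values are reported in this paper.

```python
def assembly_stats(lengths):
    """Return (scaffold_count, total_bp, n50) for a list of scaffold lengths."""
    if not lengths:
        raise ValueError("no scaffolds given")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # N50 is the length of the scaffold at which the cumulative sum
        # (longest first) reaches at least half of the total length.
        if 2 * running >= total:
            return len(ordered), total, length


# Hypothetical scaffold lengths in bp, for illustration only.
toy_scaffolds = [400, 300, 200, 100]
count, total_bp, n50 = assembly_stats(toy_scaffolds)
print(count, total_bp, n50)
```

For a real assembly, the lengths would be read from the scaffold FASTA file produced by the assembler.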
Genes were predicted by Rapid Annotation using Subsystem Technology (RAST) with Glimmer 3, followed by manual curation. The predicted coding sequences (CDSs) were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database and the Clusters of Orthologous Groups (COG) database. These data sources were combined to assign a product description to each predicted protein. We then compared the predicted genes with the annotated genes of four available Salmonella genomes: S. typhi Ty2, S. typhimurium LT2 (AE006468), S. arizonae RKS2980 (CP000880) and S. bongori NCTC12419 (NC_015761). Non-coding genes and miscellaneous features were predicted using tRNAscan-SE, RNAmmer, Rfam and TMHMM.
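The translation step that precedes the protein database searches can be illustrated with a short, dependency-free sketch. It assumes the bacterial genetic code (NCBI translation table 11), which maps codons to amino acids exactly as the standard code does but permits alternative start codons. The names `CODON_TABLE` and `translate_cds` are hypothetical, and this is not the code used by RAST or by the authors.

```python
from itertools import product

# Amino acids for the 64 codons in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
# Translation table 11 shares this mapping with the standard genetic code.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), _AMINO_ACIDS)
}

# Alternative start codons commonly seen in bacteria; as initiators they
# are read as methionine.
ALT_STARTS = {"GTG", "TTG"}


def translate_cds(cds):
    """Translate a coding-strand CDS (5'->3') up to the first stop codon."""
    seq = cds.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    if protein and seq[:3] in ALT_STARTS:
        protein[0] = "M"
    return "".join(protein)


print(translate_cds("ATGGCCAAATAA"))
```

The resulting protein strings are what would be submitted to a protein-level search against the non-redundant and COG databases.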
Nucleotide content and gene count levels of the genome (table; values and the % of total column are not recoverable)
Attributes listed: genome size (bp); G + C content (bp); coding region (bp); genes assigned to COGs
Number of genes associated with the 25 general COG functional categories (table; gene counts and the % of total column are not recoverable)
Categories listed: Translation, ribosomal structure and biogenesis; RNA processing and modification; Replication, recombination and repair; Chromatin structure and dynamics; Cell cycle control, mitosis and meiosis; Signal transduction mechanisms; Cell wall/membrane biogenesis; Intracellular trafficking and secretion; Posttranslational modification, protein turnover, chaperones; Energy production and conversion; Carbohydrate transport and metabolism; Amino acid transport and metabolism; Nucleotide transport and metabolism; Coenzyme transport and metabolism; Lipid transport and metabolism; Inorganic ion transport and metabolism; Secondary metabolites biosynthesis, transport and catabolism; General function prediction only; Not in COGs
Insights from the genome sequence
Distribution of known SPIs in four representative genomes of the genus Salmonella (table; presence/absence entries are not recoverable)
Genomes compared: S. bongori 12419; S. arizonae RKS2983; S. typhimurium LT2; S. typhi Ty2
S. arizonae is phylogenetically positioned between S. bongori and Salmonella subgroup I, sharing some pathogenicity-associated genes with S. bongori and others with Salmonella subgroup I lineages. Analyses of the S. arizonae genome may therefore provide important clues to key genomic events that might have facilitated the evolution of pathogens of warm-blooded animals from parasites of cold-blooded hosts.
SARC: Salmonella reference collection C
CTAB: Cetyl trimethyl ammonium bromide
MLEE: Multilocus enzyme electrophoresis
This work was supported by a Heilongjiang Innovation Endowment Award for graduate studies to CXW (YJSCX2012-235HLJ) and to XYW (YJSCX2012-198HLJ); National Natural Science Foundation of China (NSFC30970078 and NSFC81201248) and a grant of Natural Science Foundation of Heilongjiang Province of China to GRL; and grants of the National Natural Science Foundation of China (NSFC30970119, 81030029, 81271786, NSFC-NIH 81161120416) to SLL.
- Brenner FW, Villar RG, Angulo FJ, Tauxe R, Swaminathan B. Salmonella nomenclature. J Clin Microbiol. 2000;38(7):2465–7.
- Boyd EF, Wang FS, Whittam TS, Selander RK. Molecular genetic relationships of the salmonellae. Appl Environ Microbiol. 1996;62(3):804–8.
- Tang L, Wang CX, Zhu SL, Li Y, Deng X, Johnston RN, et al. Genetic boundaries to delineate the typhoid agent and other Salmonella serotypes into distinct natural lineages. Genomics. 2013;102(4):331–7.
- Tang L, Li Y, Deng X, Johnston RN, Liu GR, Liu SL. Defining natural species of bacteria: clear-cut genomic boundaries revealed by a turning point in nucleotide sequence divergence. BMC Genomics. 2013;14:489.
- Tang L, Liu SL. The 3Cs provide a novel concept of bacterial species: messages from the genome as illustrated by Salmonella. Antonie Van Leeuwenhoek. 2012;101(1):67–72.
- Baumler AJ, Tsolis RM, Ficht TA, Adams LG. Evolution of host adaptation in Salmonella enterica. Infect Immun. 1998;66(10):4579–87.
- Doolittle RF, Feng DF, Tsang S, Cho G, Little E. Determining divergence times of the major kingdoms of living organisms with a protein clock. Science. 1996;271(5248):470–7.
- Lawrence JG, Ochman H. Molecular archaeology of the Escherichia coli genome. Proc Natl Acad Sci U S A. 1998;95(16):9413–7.
- Porwollik S, McClelland M. Lateral gene transfer in Salmonella. Microbes Infect. 2003;5(11):977–89.
- Jones BD, Ghori N, Falkow S. Salmonella typhimurium initiates murine infection by penetrating and destroying the specialized epithelial M cells of the Peyer’s patches. J Exp Med. 1994;180(1):15–23.
- McCormick BA, Miller SI, Carnes D, Madara JL. Transepithelial signaling to neutrophils by salmonellae: a novel virulence mechanism for gastroenteritis. Infect Immun. 1995;63(6):2302–9.
- Schmidt H, Hensel M. Pathogenicity islands in bacterial pathogenesis. Clin Microbiol Rev. 2004;17(1):14–56.
- Ochman H, Groisman EA. Distribution of pathogenicity islands in Salmonella spp. Infect Immun. 1996;64(12):5410–2.
- Hensel M, Shea JE, Baumler AJ, Gleeson C, Blattner F, Holden DW. Analysis of the boundaries of Salmonella pathogenicity island 2 and the corresponding chromosomal region of Escherichia coli K-12. J Bacteriol. 1997;179(4):1105–11.
- Hensel M, Shea JE, Waterman SR, Mundy R, Nikolaus T, Banks G, et al. Genes encoding putative effector proteins of the type III secretion system of Salmonella pathogenicity island 2 are required for bacterial virulence and proliferation in macrophages. Mol Microbiol. 1998;30(1):163–74.
- Cirillo DM, Valdivia RH, Monack DM, Falkow S. Macrophage-dependent induction of the Salmonella pathogenicity island 2 type III secretion system and its role in intracellular survival. Mol Microbiol. 1998;30(1):175–88.
- Fookes M, Schroeder GN, Langridge GC, Blondel CJ, Mammina C, Connor TR, et al. Salmonella bongori provides insights into the evolution of the Salmonellae. PLoS Pathogens. 2011;7(8):e1002191.
- Ochman H, Soncini FC, Solomon F, Groisman EA. Identification of a pathogenicity island required for Salmonella survival in host cells. Proc Natl Acad Sci U S A. 1996;93(15):7800–4.
- Crosa JH, Brenner DJ, Ewing WH, Falkow S. Molecular relationships among the Salmonelleae. J Bacteriol. 1973;115(1):307–15.
- SGSC web site. [www.ucalgary.ca/~kesander]
- Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26(5):541–7.
- Doyle JJ, Doyle JL. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987;19:11–5.
- Illumina web site. [http://www.illumina.com/technology/sequencing_technology.ilmn]
- Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST Server: rapid annotations using subsystems technology. BMC Genomics. 2008;9:75.
- Delcher AL, Bratke KA, Powers EC, Salzberg SL. Identifying bacterial genes and endosymbiont DNA with glimmer. Bioinformatics. 2007;23(6):673–9.
- McClelland M, Sanderson KE, Spieth J, Clifton SW, Latreille P, Courtney L, et al. Complete genome sequence of Salmonella enterica serovar Typhimurium LT2. Nature. 2001;413(6858):852–6.
- Desai PT, Porwollik S, Long F, Cheng P, Wollam A, Bhonagiri-Palsikar V, et al. Evolutionary genomics of Salmonella enterica subspecies. mBio. 2013;4(2):e00198-13.
- Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25(5):955–64.
- Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35(9):3100–8.
- Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. Rfam: an RNA family database. Nucleic Acids Res. 2003;31(1):439–41.
- Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305(3):567–80.
- Katoh K, Misawa K, Kuma K, Miyata T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 2002;30(14):3059–66.
- Tamura K, Dudley J, Nei M, Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007;24(8):1596–9.
- Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87(12):4576–9.
- Garrity GM, Bell JA, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. In: Bergey’s manual of systematic bacteriology, volume 2, part B. 2nd ed. New York: Springer; 2005:1.
- Garrity GM, Bell JA, Lilburn T. Class III. Gammaproteobacteria class. nov. In: Bergey’s manual of systematic bacteriology, volume 2, part B. 2nd ed. New York: Springer; 2005:1.
- Garrity GM, Holt JG. Taxonomic outline of the archaea and bacteria. In: Bergey’s manual of systematic bacteriology, volume 1. 2nd ed. New York: Springer; 2001. p. 155–66.
- Rahn O. New principles for the classification of bacteria. Zentralbl Bakteriol Parasitenkd Infektionskr Hyg. 1937;96:273–86.
- Commission J. Conservation of the family name Enterobacteriaceae, of the name of the type genus, and designation of the type species Opinion No. 15. Int Bull Bacteriol Nomencl Taxon. 1958;8:74.
- Goullet P. Esterase electrophoretic pattern relatedness between Shigella species and Escherichia coli. J Gen Microbiol. 1980;117:493–500.
- Tindall BJ, Grimont PA, Garrity GM, Euzeby JP. Nomenclature and taxonomy of the genus Salmonella. Int J Syst Evol Microbiol. 2005;55(Pt 1):521–4.
- Le Minor L, Popoff MY. Request for an opinion. Designation of Salmonella enterica sp. nov., nom. rev., as the type and only species of the genus Salmonella. Int J Syst Bacteriol. 1987;37:465–8.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet. 2000;25(1):25–9.