Genome sequences of two closely related strains of Escherichia coli K-12 GM4792
© Zhang et al. 2015
Received: 5 June 2015
Accepted: 9 November 2015
Published: 10 December 2015
Escherichia coli lab strains K-12 GM4792 Lac+ and GM4792 Lac- carry opposite lactose markers, which are useful for distinguishing evolved lines as they produce different colored colonies. The two closely related strains are chosen as ancestors for our ongoing studies of experimental evolution. Here, we describe the genome sequences, annotation, and features of GM4792 Lac+ and GM4792 Lac-. GM4792 Lac+ has a 4,622,342-bp long chromosome with 4,061 protein-coding genes and 83 RNA genes. Similarly, the genome of GM4792 Lac- consists of a 4,621,656-bp chromosome containing 4,043 protein-coding genes and 74 RNA genes. Genome comparison analysis reveals that the differences between GM4792 Lac+ and GM4792 Lac- are minimal and limited to only the targeted lac region. Moreover, a previous study on competitive experimentation indicates the two strains are identical or nearly identical in survivability except for lactose utilization in a nitrogen-limited environment. Therefore, at both a genetic and a phenotypic level, GM4792 Lac+ and GM4792 Lac-, with opposite neutral markers, are ideal systems for future experimental evolution studies.
Keywords: Escherichia coli K12 GM4792, Lactose, Gram-negative, Genome comparison, Experimental evolution, Variant analysis
Microbial experimental evolution systems, which can generate a ‘fossil’ record for later study and use replicate populations to test the predictability of evolution, offer a chance to ‘replay’ the evolutionary process, ‘watch’ evolution in action and measure the fitness of evolved lines under the relevant environmental conditions. However, the lack of obvious phenotypic differences makes microbial lineages difficult to tell apart. Fortunately, some neutral genetic markers help distinguish evolved lines by differences in colony color. Typically, when a derived strain with an opposite marker relative to its progenitor is required, one can be selected using specific culture media. Subsequently, the degree of neutrality of this marker is evaluated by comparing the fitness of the two strains containing opposite markers under the culture conditions used in the study. The lactose marker is one such marker. For the lac operon, a previous study used targeted sequencing to characterize the mutations that distinguish strains with opposite lactose markers.
Since the publication of the K-12 genome in 1997, Escherichia coli has been thoroughly studied with regard to its genetics [7–9], biochemistry [10–12], metabolic reconstruction, pathway inference, genomics [14–16] and metabolic engineering. E. coli strain K-12 GM4792, a laboratory strain, contains the chromosomal lacI33::lacZ allele and is unable to utilize lactose. GM4792 was derived from the parent strain P90C [ara-600 del(gpt-lac)5 LAM- relA1 spoT1 thiE1] [19–21] by homogenizing a Pro+ Lac+/F' lacI33::lacZ and then curing the episome with acridine orange (M. G. Marinus, personal communication). A previous study yielded two closely related strains, GM4792 Lac- and GM4792 Lac+, that carry opposite lactose markers and from which plasmids have been removed, for further studies on experimental evolution. Here, Lac+ refers to the ability of the strain to utilize lactose and Lac- refers to the inability to utilize lactose. These strains were chosen as ancestors for our ongoing studies of the experimental evolution of E. coli in a nitrogen-limited environment. In this study, we summarize the classification and features of E. coli GM4792 Lac+ and GM4792 Lac-, together with a description of the genome sequencing and annotation. This work provides a foundation for future variant analysis of evolved lines at the genomic scale. To compare GM4792 Lac+ and GM4792 Lac-, we used the breseq pipeline v0.20 to detect initial variants and subsequently applied a series of filters to eliminate false positives. Using this method, two significant variants were detected: a synonymous single nucleotide polymorphism and a 1-bp deletion responsible for lactose metabolism. A previous study on competitive experimentation has shown that these two strains are identical or nearly identical in survivability, except for lactose utilization in a nitrogen-limited environment.
Thus, both genetically and phenotypically, GM4792 Lac+ and GM4792 Lac- carry neutral markers and are appropriate for future experimental evolution studies.
Classification and features
Classification and general features of Escherichia coli strain K-12 GM4792 according to the MIGS recommendations 
Species Escherichia coli
IDA, TAS 
IDA, TAS 
10 °C ~ 45 °C
IDA, TAS 
pH range; Optimum
IDA, TAS 
October 7, 2007
Because E. coli is a model organism, the molecular structure and chemical composition of its cell wall have been thoroughly studied, as described in detail by Scheutz and Strockbine. Similar to other strains of E. coli, GM4792 has a single peptidoglycan layer within the periplasm, consisting of D-glutamic acid, D-alanine, meso-diaminopimelic acid, N-acetylglucosamine and N-acetylmuramic acid linked to the tetrapeptide L-alanine. The cells stain Gram-negative and contain an outer membrane, with a lipopolysaccharide layer containing lipid A, the core region of the phosphorylated nonrepeating oligosaccharides and the O-antigen polymer [7, 25, 26].
Genome sequencing information
Genome project history
Libraries used: Two paired-end libraries (180 bp and 380 bp) and two mate-pair libraries (2,000 bp and 6,000 bp)
Sequencing platform: Illumina HiSeq 2000
Fold coverage: ~330× for GM4792 Lac+ and ~370× for GM4792 Lac- (180 bp); ~100× (other libraries)
Assembler: ALLPATHS-LG Release 42411
Gene calling method: RATT, Prodigal v2.5
Locus tag: U068 for Lac+ and U069 for Lac-
GenBank ID: CP011342 for Lac+ and CP011343 for Lac-
GenBank Date of Release: Jun 6, 2015
GOLD ID: Gi0059689 for GM4792 Lac+ and Gi0059688 for GM4792 Lac-
BioProject: PRJNA224130 for GM4792 Lac+ and PRJNA224131 for GM4792 Lac-
SRA runs, GM4792 Lac+ : SRR2596368, SRR2537294,
SRA runs, GM4792 Lac- : SRR2529478, SRR1039666,
Source Material Identifier
Project relevance: Experimental evolution, Tree of Life
Growth conditions and genomic DNA preparation
After receiving the laboratory strain GM4792 from M. G. Marinus, a single clone was randomly selected as a Lac- strain. A single Lac+ clone was obtained after the Lac- strain had been incubated for 4 days under selection conditions for lactose metabolism. Strains stored at –40 °C were thawed at room temperature. Each strain was streaked on LB solid medium with an inoculation needle and incubated for 24 h at 37 °C. Once distinct single colonies had grown, one colony was selected, inoculated into 5 ml LB liquid medium and grown at 37 °C with shaking for 24 h. Total genomic DNA was extracted using the TIANamp Bacteria DNA Kit (Code:DP302, TIANGEN BIOTECH, Beijing, China), according to the manufacturer’s instructions. Additional RNaseA (Code:RT405-12, TIANGEN BIOTECH CO, Beijing, China) was added, following the manufacturer’s instructions. The quality and quantity of the genomic DNA were evaluated using agarose gel electrophoresis and the λ-Hind III digest DNA Marker (Code:D3403A, TaKaRa, China). For each sample, approximately 3 μg DNA with a concentration of 100 ng/μl was obtained.
Genome sequencing and assembly
Whole-genome sequencing was performed on the Illumina HiSeq 2000 using paired-end and mate-pair libraries with average insert sizes of 180 bp, 380 bp, 2 kbp and 6 kbp. The read length for each library was 100 bp. Duplicate read pairs were filtered out of each library with FastUniq v1.1, and reads contaminated by Illumina adapters were removed with the cutadapt tool. Subsequently, reads amounting to ~330× (Lac+) or ~370× (Lac-) coverage from the 180-bp library and ~100× coverage from each of the other three libraries were used to perform the assembly. The genomes were assembled with ALLPATHS-LG Release 42411, which begins by correcting sequencing errors. The GapCloser version 1.12 program was then used to close gaps in the resulting scaffolds, after which ICORN was used to correct the assembly. Finally, the six remaining gaps were completely closed by additional PCR experiments. More details are shown in Additional file 2.
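The fold-coverage values quoted for each library can be related to read counts with the usual approximation coverage = (number of reads × read length) / genome size. The following is a minimal sketch of that arithmetic, not part of the authors' pipeline; the function names are hypothetical, and the read counts it produces are derived for illustration rather than reported in this study.

```python
# Sketch: relate fold coverage to read count for a fixed read length.
# Genome sizes and the 100-bp read length come from the text; the
# function names and the derived read counts are illustrative only.

GENOME_SIZE_BP = {
    "GM4792 Lac+": 4_622_342,
    "GM4792 Lac-": 4_621_656,
}
READ_LENGTH_BP = 100


def fold_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average number of times each base is covered by a read."""
    return n_reads * read_length_bp / genome_size_bp


def reads_needed(target_coverage: int, read_length_bp: int, genome_size_bp: int) -> int:
    """Smallest number of reads that reaches the target coverage (integer ceiling)."""
    total_bases = target_coverage * genome_size_bp
    return -(-total_bases // read_length_bp)


if __name__ == "__main__":
    for strain, target in (("GM4792 Lac+", 330), ("GM4792 Lac-", 370)):
        n = reads_needed(target, READ_LENGTH_BP, GENOME_SIZE_BP[strain])
        cov = fold_coverage(n, READ_LENGTH_BP, GENOME_SIZE_BP[strain])
        print(f"{strain}: {n} reads of {READ_LENGTH_BP} bp give {cov:.1f}x coverage")
```

Because the two chromosomes differ by fewer than 700 bp, the difference between ~330× and ~370× reflects sequencing yield rather than genome size.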
Because the GM4792 strains are very closely related to strain MG1655, their annotations were first transferred from MG1655 using RATT. De novo annotation was then performed on regions with imperfectly transferred annotations and on insertions relative to strain MG1655. tRNA and rRNA genes were identified using tRNAscan-SE v1.3.1 and RNAmmer v1.2, respectively. Coding sequences (CDSs) were identified using Prodigal v2.5. CDSs were translated and searched against the NCBI nonredundant database, UniProt (released 2012-10), InterPro v40, TIGRFAMs, Pfam, and COG databases for functional annotation. Genes with signal peptides and transmembrane helices were predicted with SignalP v4.0 and TMHMM v2.0, respectively. Clustered regularly interspaced short palindromic repeats (CRISPR) were identified with CRT v1.2. Transcription factors were identified based on the results of domain identification and the DBD database v2.0. Gene ontology term assignment was performed using the GO database (released 2013-3-30) and Blast2Go Pipeline v2.5.0. Metabolic pathways were constructed based on the KEGG database (Release 76.0) and KAAS. The complete sets of input parameters used for each program are shown in Table S7 of Additional file 1.
% of Totala,b
% of Totala,c
Genome size (bp)
DNA coding (bp)
DNA G + C (bp)
Protein coding genes
Genes in internal clusters
Genes with function prediction
Genes assigned to COGs
Genes with Pfam domains
Genes with signal peptides
Genes with transmembrane helices
Number of genes associated with general COG functional categories
Translation, ribosomal structure and biogenesis
RNA processing and modification
Replication, recombination and repair
Chromatin structure and dynamics
Cell cycle control, cell division, chromosome partitioning
Signal transduction mechanisms
Cell wall/membrane/envelope biogenesis
Intracellular trafficking, secretion, and vesicular transport
Posttranslational modification, protein turnover, chaperones
Mobilome: prophages, transposons
Energy production and conversion
Carbohydrate transport and metabolism
Amino acid transport and metabolism
Nucleotide transport and metabolism
Coenzyme transport and metabolism
Lipid transport and metabolism
Inorganic ion transport and metabolism
Secondary metabolites biosynthesis, transport and catabolism
General function prediction only
Not in COGs
Insights from the genome sequence
The paired-end reads of Lac+ from the 380-bp insert library and the scaffolds of Lac- were analyzed using the breseq pipeline v0.20 to identify mutations based on read alignments. Six types of variants could be identified: single-base substitution, multiple-base substitution, insertion, deletion, mobile element insertion, and sequence amplification. All mutations with another variant within the adjacent 20 base pairs were removed. Then, mutations that persisted when the reads of Lac- were mapped to the genome of Lac- were removed. All of the retained mutations were manually reviewed using the graphical output of the mapping results. After filtering, only two significant variants were left: one 1-bp deletion in lacI and one synonymous SNP outside of the lac operon (Additional file 1: Table S1). We performed a multiple sequence alignment of the three DNA segments containing the lacI and lac operons from the MG1655, Lac- and Lac+ strains using the CLUSTALW program. We detected a 212-bp deletion, which consisted of the last 16 bp of lacI, all of the lac promoter and operator, and the first 74 bp of lacZ, in both the Lac- and Lac+ genomes compared to MG1655. In the Lac- strain, an insertion of a C at bp 961 generates a stop codon at bp 1281. Lacking the promoter and operator, the lac operon cannot be transcribed. Therefore, the Lac- strain could not utilize lactose. In Lac+, the reverse occurred: a 1-bp deletion in this region. This 1-bp frameshift deletion in lacI led to the loss of the stop codon; lacI was thus fused to the lac operon, and the fusion was transcribed from the lacI promoter (Additional file 1: Figure S4). Thus, GM4792 Lac+ could catabolize lactose. This transition is in agreement with previous studies [5, 18, 51]. In addition, the GM4792 strains were compared to MG1655 on the whole-genome scale with Mauve version snapshot_2015-02-25.
For GM4792 Lac+, 450 SNPs and 112 indels were identified relative to MG1655. For GM4792 Lac-, a total of 441 SNPs and 109 indels were identified relative to MG1655. More details are shown in Additional file 1: Tables S2–S5.
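The two automatic filters applied to the breseq calls (removing variants that have another variant within 20 bp, then removing variants that also appear when Lac- reads are mapped to the Lac- genome) can be sketched as follows. This is an illustrative reading of the description above, not the authors' scripts: the `Variant` record, the function names and every coordinate in the example are hypothetical, and breseq's own output format is not modeled.

```python
# Sketch of the two-step false-positive filter described in the text.
# All names and example coordinates are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    position: int   # 1-based coordinate on the reference
    kind: str       # e.g. "SNP", "INS", "DEL"
    change: str     # free-text description of the change


def drop_clustered(variants, window=20):
    """Remove every variant that has another variant within `window` bp."""
    ordered = sorted(variants, key=lambda v: v.position)
    kept = []
    for i, v in enumerate(ordered):
        near_prev = i > 0 and v.position - ordered[i - 1].position <= window
        near_next = (i + 1 < len(ordered)
                     and ordered[i + 1].position - v.position <= window)
        if not (near_prev or near_next):
            kept.append(v)
    return kept


def drop_self_mapping(variants, control_variants):
    """Remove variants that also appear in the self-mapping control."""
    control = set(control_variants)
    return [v for v in variants if v not in control]


if __name__ == "__main__":
    candidates = [
        Variant(100, "SNP", "A>G"),
        Variant(110, "SNP", "C>T"),      # within 20 bp of the call at 100
        Variant(5_000, "DEL", "-1 bp"),
        Variant(9_000, "SNP", "G>A"),
        Variant(20_000, "INS", "+C"),    # also seen in the control mapping
    ]
    control = [Variant(20_000, "INS", "+C")]

    retained = drop_self_mapping(drop_clustered(candidates), control)
    for v in retained:
        print(v)
```

In this toy input two candidates survive both filters. In the study, the retained calls were additionally reviewed by hand against the graphical mapping output, a step that is not automated in this sketch.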
Phenotypic analysis revealed that the lactose marker was neutral under the conditions used in our studies of experimental evolution of E. coli in a nitrogen-limited environment; the ratio of fitness between GM4792 Lac- and GM4792 Lac+ was 1.00 (0.994 ~ 1.036, 95 % confidence interval). Therefore, at both the genotypic and phenotypic levels, these two strains differ only by their ability to utilize lactose, indicating that GM4792 Lac+ and GM4792 Lac- are a good system for studies of population evolution and adaptation.
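The text does not state how the fitness ratio was computed. A common convention in competition experiments, assumed here purely for illustration, is the ratio of the two competitors' realized Malthusian parameters, ln(final/initial). The sketch below uses that convention with hypothetical colony counts, and then checks whether the reported 95 % confidence interval contains 1, which is what a neutral marker predicts.

```python
# Sketch: relative fitness as a ratio of Malthusian parameters.
# ASSUMPTION: this formula is a common convention, not necessarily the
# one used in the cited study. The colony counts are hypothetical;
# only the confidence interval (0.994, 1.036) is taken from the text.
import math


def malthusian_parameter(n_initial: float, n_final: float) -> float:
    """Realized growth over the competition: ln(N_final / N_initial)."""
    return math.log(n_final / n_initial)


def relative_fitness(a_initial, a_final, b_initial, b_final) -> float:
    """Fitness of competitor A relative to competitor B."""
    return (malthusian_parameter(a_initial, a_final)
            / malthusian_parameter(b_initial, b_final))


def consistent_with_neutrality(ci_low: float, ci_high: float) -> bool:
    """True if the confidence interval for the fitness ratio contains 1."""
    return ci_low <= 1.0 <= ci_high


if __name__ == "__main__":
    # Hypothetical counts in which both competitors grow 100-fold.
    w = relative_fitness(1e5, 1e7, 2e5, 2e7)
    print(f"relative fitness (hypothetical counts): {w:.3f}")

    # Confidence interval reported in the text.
    print("CI contains 1:", consistent_with_neutrality(0.994, 1.036))
```

Two competitors that grow by the same factor have a ratio of exactly 1 regardless of their starting densities, which is why the two differently colored colony types can be counted from a mixed culture without correcting for unequal inocula.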
This study presents two closely related genomes, those of E. coli lab strains K-12 GM4792 Lac+ and GM4792 Lac-, which lay a solid foundation for future variant analysis of evolved lines at the genome scale in evolutionary experiments. A whole-genome comparison reveals that the genome-wide differences between GM4792 Lac+ and GM4792 Lac- are minimal. Only two significant variants were detected: one is a synonymous SNP, and the other is a 1-bp deletion that is responsible for lactose utilization in GM4792 Lac+. Moreover, phenotypic analysis showed that GM4792 Lac+ and GM4792 Lac- are nearly identical regarding survivability, except for lactose utilization, in a nitrogen-limited environment. Together, these results indicate that GM4792 Lac+ and GM4792 Lac-, with their neutral markers, are ideal systems for future experimental evolution studies.
We thank two anonymous reviewers for their invaluable comments and suggestions. The authors gratefully acknowledge the generous help of M. G. Marinus for providing us GM4792. We also thank Hong-Tao Song for useful comments on the manuscript. This work was supported by the National Natural Science Foundation of China (Grant No. 31421063) and the State Key Laboratory of Earth Surface Processes and Resource Ecology (Grant No. 2013-ZY-10).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Barrick JE, Lenski RE. Genome dynamics during experimental evolution. Nat Rev Genet. 2013;14(12):827–39.
- Elena SF, Lenski RE. Evolution experiments with microorganisms: the dynamics and genetic bases of adaptation. Nat Rev Genet. 2003;4(6):457–69.
- Barrick JE, Kauth MR, Strelioff CC, Lenski RE. Escherichia coli rpoB mutants have increased evolvability in proportion to their fitness defects. Mol Biol Evol. 2010;27(6):1338–47.
- Lenski R, Rose M, Simpson S, Tadler S. Long-term experimental evolution in Escherichia coli. I. Adaptation and divergence during 2,000 generations. American Naturalist. 1991;138(6):1315–41.
- Foster PL, Trimarchi JM. Adaptive reversion of a frameshift mutation in Escherichia coli by simple base deletions in homopolymeric runs. Science. 1994;265(5170):407–9.
- Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, et al. The complete genome sequence of Escherichia coli K-12. Science. 1997;277(5331):1453.
- Meier-Kolthoff JP, Hahnke RL, Petersen J, Scheuner C, Michael V, Fiebig A, et al. Complete genome sequence of DSM 30083T, the type strain (U5/41T) of Escherichia coli, and a proposal for delineating subspecies in microbial taxonomy. Stand Genomic Sci. 2014;9(1):2.
- Allocati N, Masulli M, Alexeyev MF, Di Ilio C. Escherichia coli in Europe: An Overview. Int J Environ Res Public Health. 2013;10(12):6235–54.
- Kaper JB, Nataro JP, Mobley HLT. Pathogenic Escherichia coli. Nat Rev Microbiol. 2004;2(2):123–40.
- Tee TW, Chowdhury A, Maranas CD, Shanks JV. Systems metabolic engineering design: Fatty acid production as an emerging case study. Biotechnol Bioeng. 2014;111(5):849–57.
- Wen M, Bond-Watts BB, Chang MCY. Production of advanced biofuels in engineered E. coli. Curr Opin Chem Biol. 2013;17(3):472–9.
- Donovan C, Bramkamp M. Cell division in Corynebacterineae. Frontiers in Microbiology. 2014;5.
- Rosano GL, Ceccarelli EA. Recombinant protein expression in Escherichia coli: advances and challenges. Frontiers in Microbiology. 2014;5.
- Kuzminov A. The chromosome cycle of prokaryotes. Mol Microbiol. 2013;90(2):214–27.
- Whitfield C, Roberts IS. Structure, assembly and regulation of expression of capsules in Escherichia coli. Mol Microbiol. 1999;31(5):1307–19.
- Cooper KK, Mandrell RE, Louie JW, Korlach J, Clark TA, Parker CT, et al. Comparative genomics of enterohemorrhagic Escherichia coli O145:H28 demonstrates a common evolutionary lineage with Escherichia coli O157:H7. BMC Genomics. 2014;15.
- Kang Z, Zhang C, Zhang J, Jin P, Zhang J, Du G, et al. Small RNA regulators in bacteria: powerful tools for metabolic engineering and synthetic biology. Appl Microbiol Biotechnol. 2014;98(8):3413–24.
- Foster PL, Trimarchi JM. Adaptive reversion of an episomal frameshift mutation in Escherichia coli requires conjugal functions but not actual conjugation. Proc Natl Acad Sci U S A. 1995;92(12):5487–90.
- Coulondre C, Miller JH. Genetic studies of the lac repressor: III. Additional correlation of mutational sites with specific amino acid residues. J Mol Biol. 1977;117(3):525–67.
- Miller JH. Experiments in molecular genetics. Cold Spring Harbor: Cold Spring Harbor Laboratory; 1972.
- Miller JH. A short course in bacterial genetics. Cold Spring Harbor: Cold Spring Harbor Laboratory; 1992.
- Ni C. The experimental evolution of Escherichia coli in nitrogen limited environment, PhD thesis. Beijing: Normal University, College of Life Sciences; 2010.
- Deatherage DE, Barrick JE. Identification of mutations in laboratory-evolved microbes from next-generation sequencing data using breseq. Methods Mol Biol. 2014;1151:165–88.
- Topley WWC, Wilson GS. The Principles of Bacteriology and Immunity. 2nd ed. 1936.
- Welch RA. The genus Escherichia. The Prokaryotes. New York: Springer; 2006. p. 60–71.
- Scheutz F, Strockbine N. Genus I. Escherichia Castellani and Chalmers 1919, 941TAL. In: Brenner DJ KN, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2. 2nd ed. New York: Springer; 2005. p. 607–24. The Proteobacteria.
- Pagani I, Liolios K, Jansson J, Chen IMA, Smirnova T, Nosrat B, et al. The Genomes OnLine Database (GOLD) v. 4: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res. 2012;40(D1):D571–9.
- Xu H, Luo X, Qian J, Pang X, Song J, Qian G, et al. FastUniq: A fast de novo duplicates removal tool for paired short reads. PLoS One. 2012;7(12):e52249.
- Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet journal. 2011;17(1):10–2.
- Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci U S A. 2011;108(4):1513–8.
- Luo RB, Liu BH, Xie YL, Li ZY, Huang WH, Yuan JY, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1:6.
- Otto TD, Sanders M, Berriman M, Newbold C. Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology. Bioinformatics. 2010;26(14):1704–7.
- Otto TD, Dillon GP, Degrave WS, Berriman M. RATT: rapid annotation transfer tool. Nucleic Acids Res. 2011;39(9):7.
- Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25(5):955–64.
- Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35(9):3100–8.
- Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.
- Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, et al. UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 2004;32:D115–9.
- Hunter S, Jones P, Mitchell A, Apweiler R, Attwood TK, Bateman A, et al. InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res. 2011;40(D1):D306–12.
- Haft DH, Selengut JD, White O. The TIGRFAMs database of protein families. Nucleic Acids Res. 2003;31(1):371–3.
- Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42(D1):D222–30.
- Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 2000;28(1):33–6.
- Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. Predicting transmembrane protein topology with a hidden Markov model: Application to complete genomes. J Mol Biol. 2001;305(3):567–80.
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004;340(4):783–95.
- Bland C, Ramsey TL, Sabree F, Lowe M, Brown K, Kyrpides NC, et al. CRISPR Recognition Tool (CRT): a tool for automatic detection of clustered regularly interspaced palindromic repeats. BMC Bioinformatics. 2007;8(1):209.
- Kummerfeld SK. DBD: a transcription factor prediction database. Nucleic Acids Res. 2006;34:D74–81.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene Ontology: tool for the unification of biology. Nat Genet. 2000;25(1):25–9.
- Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6.
- Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 1999;27(1):29–34.
- Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35(Web Server):W182–5.
- Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994;22(22):4673–80.
- Müller-Hill B, Kania J. Lac repressor can be fused to β-galactosidase. Nature. 1974;249(5457):561–3.
- Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: Multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14(7):1394–403.
- Zhang Y, Lin K. A phylogenomic analysis of Escherichia coli / Shigella group: implications of genomic features associated with pathogenicity and ecological adaptation. BMC Evol Biol. 2012;12:174.
- Hazen TH, Sahl JW, Fraser CM, Donnenberg MS, Scheutz F, Rasko DA. Refining the pathovar paradigm via phylogenomics of the attaching and effacing Escherichia coli. Proc Natl Acad Sci U S A. 2013;110(31):12810–5.
- Minkin I, Patel A, Kolmogorov M, Vyahhi N, Pham S. Sibelia: a scalable and comprehensive synteny block generation tool for closely related microbial genomes. Proceedings of Algorithms in Bioinformatics. Berlin: Springer; 2013. p. 215–29.
- Felsenstein J. Evolutionary trees from DNA sequences: a maximum likelihood approach. J Mol Evol. 1981;17(6):368–76.
- Price MN, Dehal PS, Arkin AP. FastTree 2 – approximately maximum-likelihood trees for large alignments. PLoS One. 2010;5(3):e9490.
- Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26(5):541–7.
- Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci. 1990;87(12):4576–9.
- Garrity GM BJ, Lilburn T. Phylum XIV. Proteobacteria phyl nov. In: Brenner DJ KN, Stanley JT, Garrity GM, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2. 2nd ed. New York: Springer; 2005. p. 1. The Proteobacteria part B The Gammaproteobacteria.
- Garrity GMBD, Lilburn T. Class III. Gammaproteobacteria class. nov. In: Garrity GM BD, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2. 2nd ed. New York: Springer; 2005. p. 1. Part B.
- Garrity GM, Holt JG. Taxonomic outline of the Archaea and Bacteria. Bergey’s Manual of Systematic Bacteriology. 2001;1:155–66.
- Brenner DJ. Family I. Enterobacteriaceae Rahn 1937, Nom. fam. cons. Opin. 15, Jud. Com. 1958, 73; Ewing, Farmer, and Brenner 1980, 674; Judicial Commission 1981, 104. In: Krieg NRHJ, editor. Bergey’s Manual of Systematic Bacteriology, vol. 1. 1st ed. Baltimore: The Williams & Wilkins Co; 1984. p. 408–20.
- Escherich T. Die Darmbakterien des Säuglings und ihre Beziehungen zur Physiologie der Verdauung. Stuttgart: Ferdinand Enke; 1886: p. 63–74.
- Editorial Board (for the Judicial Commission of the International Committee on Bacteriological Nomenclature). Opinion 26: designation of neotype strains (cultures) of type species of the bacterial genera Salmonella, Shigella, Arizona, Escherichia, Citrobacter and Proteus of the family Enterobacteriaceae. Int J Syst Evol Microbiol. 1963;13:35–6.
- List of growth media used at the DSMZ. [http://www.dsmz.de].