High quality draft genome sequence of Segniliparus rugosus CDC 945T (= ATCC BAA-974T)
© The Author(s) 2011
Published: 31 December 2011
Segniliparus rugosus represents one of two species in the genus Segniliparus, the sole genus in the family Segniliparaceae. A distinctive feature of this family is the presence of extremely long carbon-chain mycolic acids bound in the cell wall. S. rugosus is also a medically important species because it is an opportunistic pathogen associated with mammalian lung disease. S. rugosus is the second species in the genus to have its genome sequenced. The 3,567,567 bp genome, with 3,516 protein-coding and 49 RNA genes, was sequenced as part of the NIH Roadmap for Medical Research, Human Microbiome Project.
Keywords: Segniliparaceae, genome sequencing, Human Microbiome Project
Strain CDC 945T (= ATCC BAA-974T = CIP 10838T = DSM 45345 = CCUG 50838T = JCM 13579T) is the type strain of the species Segniliparus rugosus in the family Segniliparaceae. The genus name was created to acknowledge the presence of novel long carbon-chain fatty acids (mycolic acids) detected using the high performance liquid chromatography (HPLC) method for Mycobacterium species identification. The name was formed from the Latin adjective ‘segnis’, meaning ‘slow’, combined with the Greek adjective ‘liparos’, meaning ‘fatty’, to indicate the ‘one with slow fats’. The name relates to the late elution of the apolar alpha-mycolic acids (fatty acids) during HPLC analysis. The specific epithet is the Latin adjective ‘rugosus’, referring to the wrinkled, rough colony morphology. The type strain of S. rugosus, CDC 945T, was isolated from a human sputum specimen collected in Alabama, USA. S. rugosus has been isolated from multiple patients with cystic fibrosis in the U.S. and Australia and appears to be a respiratory opportunistic pathogen [3,4]. A recent isolation from a ∼1-year-old sea lion showing third-stage malnutrition with a 30% loss of body weight, moderate bradycardia and severe hypothermia suggests a possible aquatic or marine niche for the species. The only other validly named species of the genus is Segniliparus rotundus, whose type strain is CDC 1076T. S. rotundus CDC 1076T shares 98.9% 16S rRNA sequence identity with S. rugosus CDC 945T, although the DNA-DNA hybridization is less than 28%. The complete genome of S. rotundus was recently reported and has 3,157,527 bp with 3,081 protein-coding and 52 RNA genes. Here we present a summary classification and a set of features for S. rugosus CDC 945T, together with the description of the high quality draft genomic sequencing and annotation.
Classification and features
Classification and general features of S. rugosus CDC 945T according to the MIGS recommendations.
Species: Segniliparus rugosus
Type strain: CDC 945
Temperature range: mesophile, 22–42 °C
Carbon source: D-glucose, glycerol, maltose, mannitol, D-sorbitol and trehalose
Habitat: environmental water suggested
Sample collection time
Results with the API CORYNE test kit show that CDC 945T is positive for β-glucosidase and pyrazinamidase activities and negative for alkaline phosphatase, β-galactosidase, β-glucuronidase, α-glucosidase, N-acetyl-β-glucosaminidase and pyrrolidonyl arylamidase activities at 33 °C. It is susceptible to imipenem (4 µg/ml), moxifloxacin (0.5 µg/ml) and trimethoprim-sulfamethoxazole (<4.8 µg/ml), intermediate to cefoxitin (64 µg/ml) and resistant to amikacin (>128 µg/ml), clarithromycin (32 µg/ml), ciprofloxacin (16 µg/ml), ethambutol (>16 µg/ml) and tobramycin (>64 µg/ml) [1,3]. Strain CDC 945T uses D-glucose, glycerol, maltose, mannitol, D-sorbitol and trehalose as sole carbon sources with the production of acid. No growth occurs on adonitol, L-arabinose, cellobiose, citrate, dulcitol, i-erythritol, galactose, i-myo-inositol, lactose, mannose, melibiose, raffinose, L-rhamnose, salicin or sodium citrate. The strain hydrolyzes urea but not acetamide, adenine, casein, aesculin, hypoxanthine, tyrosine or xanthine.
The cell wall of strain CDC 945T contains mycolic acids and meso-diaminopimelic acid. The mycolic acid pattern developed with HPLC is a double cluster of peaks emerging at 7.24 min; the last peak group is unresolved and elutes slightly before the 110-carbon chain length, high molecular weight internal standard [1,2]. Thin layer chromatography (TLC) confirms two groups of apolar α- and α′-mycolic acids lacking oxygen function other than the hydroxyl group. The HPLC and TLC results indicate that this strain produces a unique homologous subclass of long alpha-mycolic acids with additional 90 to 110 carbons. The fatty acid profile by gas-liquid chromatography is C10:0 (8.65%), C12:0 (1.33%), C14:0 (8.49%), C16:0 (18.34%), C18:1ω9c (8.93%), C18:0 10-methyl (tuberculostearic acid, 21.62%), and C20 (28.51%).
Genome sequencing and annotation
Genome project history
Genome sequencing project information
Finishing quality: High Quality Draft
Libraries used: Two 454 pyrosequence libraries, one standard 0.6 kb fragment library and one 2.5 kb jump library
Assemblers: Newbler Assembler version 2.3 PostRelease-11/19/2009
Gene calling method: Glimmer; Metagene; PFAM; BLAST to non-redundant protein database; manual curation
Genbank Date of Release: November 10, 2010
NCBI project ID
Source material identifier
Project relevance: Human Microbiome Project
Growth conditions and DNA isolation
Strain CDC 945T was grown statically in Middlebrook 7H9 medium at 33 °C until late log phase. DNA was isolated from whole cells after a chloroform/methanol wash with a disruption solution of guanidine thiocyanate, sarkosyl and mercaptoethanol as described in Mve-Obiang et al. The purity of DNA was assessed by The Broad Institute using the Quant-iT™ dsDNA Assay High Sensitivity Kit (Invitrogen, Carlsbad, CA) according to the manufacturer’s protocol.
Genome sequencing and assembly
The genome of Segniliparus rugosus ATCC BAA-974 was sequenced using 454 pyrosequence fragment and jump libraries. We assembled the 454 data, consisting of 135,510 fragment reads and 112,271 jump reads, using Newbler Assembler version 2.3 PostRelease-11/19/2009. The assembly is considered High-Quality Draft and consists of 262 contigs arranged in 30 scaffolds with a total size of 3,567,567 bases. The error rate of this draft genome sequence is less than 1 in 10,000 (accuracy of ∼Q40). Average sequence coverage is 13×. Assessments of coverage, GC content, contig BLAST and 16S contig classification were consistent with the genus Segniliparus.
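The stated error rate and the ∼Q40 accuracy are two expressions of the same quantity on the Phred scale, Q = −10·log10(P). The following minimal Python sketch shows the conversion and the number of erroneous bases that rate would imply as an upper bound for an assembly of this size. The function names are illustrative and are not taken from the assembly pipeline.

```python
import math


def phred_from_error_rate(p_error: float) -> float:
    """Convert a per-base error probability into a Phred quality score."""
    if not 0.0 < p_error <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p_error)


def error_rate_from_phred(q: float) -> float:
    """Convert a Phred quality score back into a per-base error probability."""
    return 10.0 ** (-q / 10.0)


def expected_errors(assembly_size_bp: int, q: float) -> float:
    """Expected number of erroneous bases at a uniform quality q."""
    return assembly_size_bp * error_rate_from_phred(q)


if __name__ == "__main__":
    assembly_size_bp = 3_567_567  # total assembly size reported in the text
    q = phred_from_error_rate(1 / 10_000)
    print(f"1 error in 10,000 bases corresponds to Q{q:.0f}")
    print(f"Upper bound on expected errors: {expected_errors(assembly_size_bp, q):.0f} bases")
```

Because the text gives the error rate as "less than" 1 in 10,000, the second figure should be read as a ceiling rather than an estimate.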
Protein-coding genes were predicted using four ORF-finding tools: GeneMark, Glimmer3, Metagene, and findBlastOrfs (unpublished). This latter tool builds genes by extending whole-genome BLAST alignments, in-frame, to include start and stop codons. The final set of non-overlapping ORFs was selected from the output of these tools using an in-house gene-caller, which uses dynamic programming to score candidate gene models based on strength of similarity to entries in UniRef90, then selects non-overlapping genes that, combined, have the highest overall score. In cases where predictions overlapped non-coding RNA features (see below), the genes were manually inspected and removed when necessary. Finally, the gene set was reviewed using both the NCBI discrepancy report and the internal Broad annotation metrics. Ribosomal RNAs (rRNAs) were identified with RNAmmer. The tRNA features were identified using tRNAscan-SE. Other non-coding features were identified with Rfam. The gene product names were assigned based on HMMER equivalogs from TIGRFAM and Pfam, and BLAST hits to KEGG and SwissProt protein sequence databases. This was done using the naming tool “Pidgin”.
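The in-house gene-caller is not published, so its exact algorithm is unknown. Selecting a non-overlapping subset of scored candidates with the highest combined score is, however, the classic weighted interval scheduling problem, and the Python sketch below illustrates that general technique only. The `Candidate` type, the scores, and the coordinates are hypothetical; the sketch ignores strand, reading frame, and any tolerance for small overlaps that a real gene-caller might apply.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str     # hypothetical identifier of the predicted ORF
    start: int    # 1-based inclusive start coordinate
    end: int      # inclusive end coordinate
    score: float  # hypothetical stand-in for a similarity-based score


def select_non_overlapping(cands: List[Candidate]) -> List[Candidate]:
    """Return a non-overlapping subset of candidates with maximal total score."""
    ordered = sorted(cands, key=lambda c: c.end)
    ends = [c.end for c in ordered]
    n = len(ordered)
    best = [0.0] * (n + 1)  # best[i]: optimal score using the first i candidates
    prev = [0] * n          # prev[i]: count of earlier candidates ending before i starts

    for i, c in enumerate(ordered):
        # Candidates 0..p-1 end strictly before c.start, so they cannot overlap c.
        p = bisect_right(ends, c.start - 1, 0, i)
        prev[i] = p
        best[i + 1] = max(best[i], best[p] + c.score)

    # Trace back through the table to recover the chosen candidates.
    chosen: List[Candidate] = []
    i = n
    while i > 0:
        c = ordered[i - 1]
        if best[prev[i - 1]] + c.score > best[i - 1]:
            chosen.append(c)
            i = prev[i - 1]
        else:
            i -= 1
    chosen.reverse()
    return chosen


if __name__ == "__main__":
    candidates = [
        Candidate("orfA", 1, 300, 5.0),
        Candidate("orfB", 250, 600, 8.0),
        Candidate("orfC", 610, 900, 4.0),
        Candidate("orfD", 280, 620, 6.0),
    ]
    for gene in select_non_overlapping(candidates):
        print(gene)
```

Sorting by end coordinate lets each candidate look up its last compatible predecessor with a binary search, so the selection runs in O(n log n) time for n candidate models.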
% of Total
Genome size (bp)
DNA coding region (bp)
DNA G+C content (bp)
Number of replicons
Pseudo genes (partial genes)
Genes with function prediction
Genes in paralog clusters
Genes assigned to COGS
Genes assigned Pfam domains
Genes with signal peptides
Genes with transmembrane helices
Number of genes associated with the general COG functional categories
Translation, ribosomal structure and biogenesis
RNA processing and modification
Replication, recombination and repair
Chromatin structure and dynamics
Cell cycle control, cell division, chromosome partitioning
Signal transduction mechanisms
Cell wall/membrane/envelope biogenesis
Intracellular trafficking and secretion, and vesicular transport
Posttranslational modification, protein turnover, chaperones
Energy production and conversion
Carbohydrate transport and metabolism
Amino acid transport and metabolism
Nucleotide transport and metabolism
Coenzyme transport and metabolism
Lipid transport and metabolism
Inorganic ion transport and metabolism
Secondary metabolites biosynthesis, transport and catabolism
General function prediction only
Not in COGs
The authors gratefully acknowledge the Broad Genome Sequencing Platform, Lucia Alvarado-Balderrama for data submission and Susanna G. Hamilton for project management. We also acknowledge NIH for funding this project with grants to the Broad Institute (grants HHSN272200900017C and U54-HG004969).
- Butler WR, Floyd MM, Brown JM, Toney SR, Daneshvar MI, Cooksey RC, Carr J, Steigerwalt AG, Charles N. Novel mycolic acid-containing bacteria in the family Segniliparaceae fam. nov., including the genus Segniliparus gen. nov., with descriptions of Segniliparus rotundus sp. nov. and Segniliparus rugosus sp. nov. Int J Syst Evol Microbiol 2005; 55:1615–1624. doi:10.1099/ijs.0.63465-0
- Butler WR, Guthertz LS. Mycolic acid analysis by high-performance liquid chromatography for identification of Mycobacterium species. Clin Microbiol Rev 2001; 14:704–726. doi:10.1128/CMR.14.4.704-726.2001
- Butler WR, Sheils CA, Brown-Elliott BA, Charles N, Colin AA, Gant MJ, Goodill J, Hindman D, Toney SR, Wallace RJ, Jr., et al. First isolations of Segniliparus rugosus from patients with cystic fibrosis. J Clin Microbiol 2007; 45:3449–3452. doi:10.1128/JCM.00765-07
- Hansen T, Van-Kerckhof J, Jelfs P, Wainwright C, Ryan P, Coulter C. Segniliparus rugosus infection, Australia. Emerg Infect Dis 2009; 15:611–613. doi:10.3201/eid1504.081479
- Evans RH. Segniliparus rugosus-associated bronchiolitis in California sea lion. Emerg Infect Dis 2011; 17:311–312.
- Sikorski J, Lapidus A, Copeland A, Misra M, Glavina Del Rio T, Nolan M, Lucas S, Chen F, Tice H, Cheng JF, et al. Complete genome sequence of Segniliparus rotundus type strain (CDC 1076). Stand Genomic Sci 2010; 2:203–211. doi:10.4056/sigs.791633
- Guindon S, Gascuel O. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol 2003; 52:696–704.
- Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994; 22:4673–4680. doi:10.1093/nar/22.22.4673
- Tamura K, Dudley J, Nei M, Kumar S. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol 2007; 24:1596–1599. doi:10.1093/molbev/msm092
- Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2010; 38:D346–D354.
- Yarza P, Ludwig W, Euzeby J, Amann R, Schleifer KH, Glockner FO, Rossello-Mora R. Update of the All-Species Living Tree Project based on 16S and 23S rRNA sequence analyses. Syst Appl Microbiol 2010; 33:291–299. doi:10.1016/j.syapm.2010.08.001
- Peterson J, Garges S, Giovanni M, McInnes P, Wang L, Schloss JA, Bonazzi V, McEwen JE, Wetterstrand KA, Deal C, et al. The NIH Human Microbiome Project. Genome Res 2009; 19:2317–2323. doi:10.1101/gr.096651.109
- The National Institutes of Health. Data Analysis and Coordination Center for Human Microbiome Project. http://www.hmpdacc.org.
- Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. GenBank. Nucleic Acids Res 2011; 39:D32–D37. doi:10.1093/nar/gkq1079
- The Broad Institute. http://www.broadinstitute.org.
- Mve-Obiang A, Mestdagh M, Portaels F. DNA isolation from chloroform/methanol-treated mycobacterial cells without lysozyme and proteinase K. Biotechniques 2001; 30:272–274, 276.
- Lennon NJ, Lintner RE, Anderson S, Alvarez P, Barry A, Brockman W, Daza R, Erlich RL, Giannoukos G, Green L, et al. A scalable, fully automated process for construction of sequence-ready barcoded libraries for 454. Genome Biol 2010; 11:R15. doi:10.1186/gb-2010-11-2-r15
- Borodovsky M, McIninch J. GENMARK: Parallel gene recognition for both DNA strands. Comput Chem 1993; 17:123–133. doi:10.1016/0097-8485(93)85004-V
- Delcher AL, Harmon D, Kasif S, White O, Salzberg SL. Improved microbial gene identification with GLIMMER. Nucleic Acids Res 1999; 27:4636–4641. doi:10.1093/nar/27.23.4636
- Noguchi H, Park J, Takagi T. MetaGene: prokaryotic gene finding from environmental genome shotgun sequences. Nucleic Acids Res 2006; 34:5623–5630. doi:10.1093/nar/gkl723
- Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007; 35:3100–3108. doi:10.1093/nar/gkm160
- Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997; 25:955–964. doi:10.1093/nar/25.5.955
- Griffiths-Jones S, Moxon S, Marshall M, Khanna A, Eddy SR, Bateman A. Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res 2005; 33:D121–D124. doi:10.1093/nar/gki081
- The Broad Institute. Automated gene naming tool. http://genepidgin.sourceforge.net.
- Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008; 26:541–547. doi:10.1038/nbt1360
- Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576–4579. doi:10.1073/pnas.87.12.4576
- Garrity GM, Holt JG. The Road Map to the Manual. In Garrity GM, Boone DR, Castenholz RW (eds). Bergey’s Manual of Systematic Bacteriology, Second edition. New York: Springer; 2001. p. 119–169.
- Stackebrandt E, Rainey FA, Ward-Rainey NL. Proposal for a New Hierarchic Classification System, Actinobacteria classis nov. Int J Syst Bacteriol 1997; 47:479–491. doi:10.1099/00207713-47-2-479
- Zhi XY, Li WJ, Stackebrandt E. An update of the structure and 16S rRNA gene sequence-based definition of higher ranks of the class Actinobacteria, with the proposal of two new suborders and four new families and emended descriptions of the existing higher taxa. Int J Syst Evol Microbiol 2009; 59:589–608. doi:10.1099/ijs.0.65780-0
- Chosewood L, Wilson D, eds. Biosafety in Microbiological and Biomedical Laboratories (BMBL). 5th ed., rev. Dec. 2009. Washington D.C.: US Department of Health and Human Services, Public Health Service, Centers for Disease Control and Prevention, National Institutes of Health; 2009.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25–29. doi:10.1038/75556