High quality draft genome sequence of Corynebacterium ulceribovis type strain IMMIB-L1395T (DSM 45146T)

Corynebacterium ulceribovis strain IMMIB L-1395T (= DSM 45146T) is an aerobic to facultatively anaerobic, Gram-positive, non-spore-forming, non-motile, rod-shaped bacterium that was isolated from the skin of the udder of a cow in Schleswig-Holstein, Germany. The cell wall of C. ulceribovis contains corynemycolic acids. The cellular fatty acids are those described for the genus Corynebacterium, but tuberculostearic acid is not present. Here we describe the features of C. ulceribovis strain IMMIB L-1395T, together with genome sequence information and its annotation. The 2,300,451 bp long genome contains 2,104 protein-coding genes and 54 RNA-encoding genes and is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project.

Electronic supplementary material: The online version of this article (doi:10.1186/s40793-015-0036-7) contains supplementary material, which is available to authorized users.


Introduction
Corynebacterium ulceribovis IMMIB L-1395 T (= DSM 45146 = CCUG 55727) was first isolated from the skin of the udder of a cow with a profound ulceration [1]. The classification and identification of this species were based on chemotaxonomic traits and biochemical tests, which were supplemented by 16S rRNA gene phylogenetic assessments. Since then, there have been neither reported cases associating strains of C. ulceribovis with animal infections nor documented cases of its isolation from humans. Although members of the genus Corynebacterium are generally regarded as commensal skin colonizers in humans and animals, e.g. Corynebacterium amycolatum, Corynebacterium bovis, Corynebacterium mastitidis, Corynebacterium pseudotuberculosis, Corynebacterium xerosis and Corynebacterium ulcerans [2][3][4], the question remains unanswered whether C. ulceribovis belongs to the resident or the transient microbes of bovine skin. Therefore, the veterinary medical importance of C. ulceribovis is unclear and remains to be assessed.
Here we present a summary classification and a set of features for C. ulceribovis IMMIB-L1395 T, together with the description of the complete genomic sequencing and annotation of DSM 45146 T, providing insights into candidate genes involved in some basic biological processes.

Classification and features
Following the published hierarchical classification of Actinobacteria [5,6], C. ulceribovis belongs to the genus Corynebacterium of the family Corynebacteriaceae, one of six suprageneric taxa included in the suborder Corynebacterineae of the order Actinomycetales of the subclass Actinobacteridae of the class Actinobacteria.

Chemotaxonomy
C. ulceribovis has cell-wall chemotype IV, which includes the presence of meso-diaminopimelate (meso-DAP), arabinose and galactose. Corynemycolic acids are present. The major cellular fatty acids are palmitic (C16:0) and oleic (C18:1 ω9c) acids, which constitute more than 95 % of the total fatty acid content. Tuberculostearic acid is not present [1]. The G + C content calculated from the genome draft sequence is 59.2 mol%. No information is available on the polar lipid or respiratory lipoquinone composition.

16S rRNA gene analysis and phylogeny
Phylogenetic analyses were performed using the ARB package [7]. Evolutionary distances were calculated using the Jukes-Cantor method [8]. Phylogenetic trees were generated by maximum-parsimony (ARB_PARS), neighbour-joining and maximum-likelihood (RAxML; [9]) facilities as implemented in the ARB package. Topologies of the neighbour-joining tree were evaluated using bootstrap analyses [10] based on 500 resamplings. The sequence of the single 16S rRNA gene copy (1397 nucleotides) in the genome of C. ulceribovis DSM 45146 T was added to the ARB database [7] and compared with the 16S rRNA gene sequences of the type strains of Corynebacterium species obtained from the NCBI database. This sequence does not differ from the previously published 16S rRNA sequence (AM922112). The highest-scoring sequence of a neighboring species was that reported for the type strain of C. lactis DSM 45799 T (HE983829), which showed a similarity of 96.5 %. Figure 2 shows the phylogenetic position of C. ulceribovis DSM 45146 T within the genus Corynebacterium in a 16S rRNA based tree. It is evident from the tree that C. ulceribovis DSM 45146 T, together with C. amycolatum, C. lactis, C. sphenisci, C. sputi, C. hansenii, C. freneyi and C. xerosis, constitutes a distinct monophyletic group within the genus Corynebacterium. The clustering of this group of species was also observed in a recent study of the phylogeny of the 16S rRNA gene in Actinobacteria [11]. The coherency of members of this clade [...]

To further study the phylogenetic relationship between C. ulceribovis and the type strains of some members of this subcluster, such as C. freneyi and C. sputi, whose genome sequences are available, we compared homologous proteins annotated as polyketide synthase (Pks13), fatty acid CoA ligase (FadD32), trehalose corynomycolyl transferase (CmtC) and acetyl-CoA carboxylase (AccD3), enzymes which form an integral part of the mycolic acid biosynthetic pathway.
BLASTP analysis showed that the average amino acid identity between homologous pairs from C. ulceribovis, C. freneyi and C. sputi was around 79 % for AccD3, 62 % for Pks13, 63 % for FadD32 and 49 % for CmtC. The phylogenetic trees constructed using the maximum-likelihood and neighbor-joining methods based on this data set of protein sequences showed that C. ulceribovis, C. freneyi and C. sputi clustered adjacent to each other within the genus Corynebacterium (data not shown). Thus, one may hypothesize that this monophyletic group deserves to be recognized as the core of a new genus. However, expanded datasets are needed to affirm the phylogenetic relationship between members of this clade and to better resolve the intrageneric relationships between them. In addition, further study will be required to identify synapomorphies to delineate this lineage before a taxonomic conclusion can be made.

Classification of C. ulceribovis with evidence codes:
Phylum: Actinobacteria, TAS [5]
Class: Actinobacteria, TAS [5]
Order: Actinomycetales, TAS [5]
Family: Corynebacteriaceae, TAS [6]
Genus: Corynebacterium, TAS [130]
Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [126].
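The Jukes-Cantor correction named in the 16S rRNA analysis can be illustrated with the textbook formula d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. The sketch below is only an illustration: it assumes that the reported 96.5 % similarity can be read directly as 1 - p, and it does not reproduce how the ARB package treats gaps, ambiguous bases or alignment filters. The function name is hypothetical.

```python
import math


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance from the observed proportion p of differing sites."""
    # The correction is undefined once p reaches the saturation value of 0.75.
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in the interval [0, 0.75)")
    return -0.75 * math.log(1 - (4.0 / 3.0) * p)


# Assumption: 96.5 % similarity to C. lactis is treated as 1 - p.
similarity = 0.965
p_observed = 1 - similarity
d_jc = jukes_cantor_distance(p_observed)
print(f"observed p = {p_observed:.3f}, Jukes-Cantor distance = {d_jc:.4f}")
```

At this level of divergence the corrected distance is only slightly larger than the raw proportion of differences, because few sites are expected to have undergone multiple substitutions.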

Genome sequencing and annotation
Genome project history
The strain was selected for sequencing on the basis of its phylogenetic position [12,13], and is part of the Genomic Encyclopedia of Type Strains, Phase I: the one thousand microbial genomes (KMG) project [14], a follow-up of the Genomic Encyclopedia of Bacteria and Archaea (GEBA) pilot project [15], which aims at increasing the sequencing coverage of key reference microbial genomes and at generating a large genomic basis for the discovery of genes encoding novel enzymes [16]. KMG-I is the first of the production phases of the "Genomic Encyclopedia of Bacteria and Archaea: sequencing a myriad of type strains" initiative and a Genomic Standards Consortium project [17]. The genome project is deposited in the Genomes On Line Database [18] and the genome sequence is available from GenBank. Sequencing, finishing and annotation were performed by the DOE Joint Genome Institute (JGI) using state-of-the-art sequencing technology [19]. A summary of the project information is presented in Table 2.

[...] with the following modifications for the cell lysis solution: additional digest with 1 μl proteinase K (50 μg/μl), 7.5 μl achromopeptidase (1 U/μl), 7.5 μl lysostaphin (1 U/μl), 3 μl lysozyme (700 U/μl) and 7.5 μl mutanolysin (1 U/μl). Protein precipitation was carried out with 200 μl protein precipitation buffer (PPT) and incubation on ice overnight, followed by incubation (60 min, 37°C) with 50 μl proteinase K. DNA is available through the DNA Bank Network [21].

Genome sequencing and assembly
The draft genome of C. ulceribovis DSM 45146 T was generated using the Illumina technology [22]. An Illumina Std shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform which generated 17,830,172 reads totaling 2,674.5 Mbp. All general aspects of library construction and sequencing performed at the JGI can be found at [23]. All raw Illumina sequence data was passed through DUK, a filtering program developed at JGI, which removes known Illumina sequencing and library preparation artifacts [24]. The following steps were then performed for assembly: (1) filtered Illumina reads were assembled using Velvet (version 1.1.04) [25], (2) 1-3 Kbp simulated paired end reads were created from Velvet contigs using wgsim [26], (3) Illumina reads were assembled with simulated read pairs using Allpaths-LG (version r41043) [27]. Parameters for assembly steps were

Genome annotation
Genes were identified using Prodigal [28] as part of the DOE-JGI Annotation pipeline [29] followed by a round of manual curation using the JGI GenePRIMP pipeline [30]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database, UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation was performed within the Integrated Microbial Genomes [31].

Genome properties
The assembly of the draft genome sequence consists of eight scaffolds amounting to a 2,300,451 bp long chromosome with a GC content of approximately 59.2 % (Table 3 and Fig. 3). Of the 2,158 genes predicted, 2,104 were protein-coding genes and 54 were RNA-encoding genes. Within the genome, 22 pseudogenes were also identified. The majority of genes (73.45 %) were assigned a putative function, whilst the remaining genes were annotated as hypothetical proteins. The distribution of genes into COGs functional categories is presented in Table 4.
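The headline numbers reported for the sequencing run and the assembly can be cross-checked with simple arithmetic. The short sketch below uses only figures quoted in the text; the derived values (mean read length and raw sequencing depth) are not stated in the text and assume that all 2,674.5 Mbp of raw reads are counted, i.e. before any filtering.

```python
# Figures quoted in the text.
GENOME_SIZE_BP = 2_300_451
PROTEIN_CODING_GENES = 2_104
RNA_GENES = 54
RAW_READS = 17_830_172
RAW_BASES = 2_674.5e6  # 2,674.5 Mbp

# Derived values (assumption: raw, unfiltered reads).
total_genes = PROTEIN_CODING_GENES + RNA_GENES
protein_coding_percent = 100 * PROTEIN_CODING_GENES / total_genes
mean_read_length = RAW_BASES / RAW_READS
raw_depth = RAW_BASES / GENOME_SIZE_BP

print(f"total genes: {total_genes}")
print(f"protein-coding share: {protein_coding_percent:.1f} %")
print(f"mean read length: {mean_read_length:.0f} bp")
print(f"raw sequencing depth: {raw_depth:.0f}x")
```

The gene counts add up to the 2,158 predicted genes given in the text, and the read statistics imply reads of about 150 bp and a raw depth above 1,000-fold.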
Insights from the genome sequence

Insights into carbohydrate metabolism
As mentioned previously, glucose was the primary carbohydrate utilized by C. ulceribovis. This sugar is likely to be imported into the cells by a homolog of the phosphoenolpyruvate (PEP): sugar phosphotransferase system (PTS), which is responsible for the transport and concomitant phosphorylation of various sugars across the cell membrane. Exploring the genome of C. ulceribovis revealed the presence of the genes encoding the PTS proteins. These include the gene ptsI encoding enzyme I ([EI], A3ECDRAFT_1792) and the gene ptsH encoding the histidine carrier protein ([HPr], A3ECDRAFT_1795), as well as the gene ptsG encoding the glucose-specific enzyme II ([EIIGlc], A3ECDRAFT_1683) and the gene ptsFru encoding the fructose-specific enzyme II ([EIIFru], A3ECDRAFT_1794). A single copy of each of these genes was found within the genome of C. ulceribovis. The EI and HPr proteins lack sugar specificity and catalyze the transfer of phosphoryl groups from PEP to EIIs. EIIs are complex enzymes consisting of three protein domains, namely IIA, IIB and IIC. IIA and IIB are phosphoryl transfer proteins of the PTS, whereas IIC is the actual sugar permease [32,33]. The presence of the ptsG gene is consistent with the ability of this organism to utilize glucose as a source of carbon and energy. Besides the PTS, the genome of C. ulceribovis contains a set of genes predicted to encode a carbohydrate ABC transporter (A3ECDRAFT_0345 to A3ECDRAFT_0348), which belongs to the CUT1 family (TC 3.A.1.1-). This ABC transporter is encoded by two homologous genes for two permeases (A3ECDRAFT_0345 and A3ECDRAFT_0346), one gene for a substrate-binding protein (A3ECDRAFT_0347) and one gene for an ATP-binding protein (A3ECDRAFT_0348). Members of the CUT1 family are known to transport diverse di- and oligosaccharides, glycerol, glycerol-phosphate and polyols [34]. However, the sugar transported by this ABC transporter remains to be determined in C. ulceribovis.
The genes encoding this ABC transporter are located downstream from the genes encoding a two-component system consisting of a sensor histidine kinase and a response regulator.

Central carbohydrate metabolism
The genes involved in metabolic pathways were analyzed in detail using the information present in the KEGG database [35]. It is apparent from inspection of the genome sequence of C. ulceribovis that the genome contains a complete set of genes coding for the enzymes of the central carbohydrate metabolism, including those that are used in glycolysis, gluconeogenesis, the pentose phosphate pathway (PPP) and the tricarboxylic acid cycle (TCA).

Glycogen metabolism
Glycogen, a soluble α-linked glucose polymer (or α-glucan) with ~90 % α-1,4-links in its backbone and 10 % α-1,6-linked branches, is a source of carbon and energy storage in a wide variety of organisms, including bacteria [36]. Inspection of the genome revealed that C. ulceribovis is equipped with the genes encoding proteins involved in glycogen biosynthesis by the classical GlgC/GlgA and the GlgE pathways. Key genes encoding enzymes involved in the GlgC/GlgA pathway include: glgC, encoding glucose-[...]

BLASTP analysis revealed that the Pep2 protein is a maltokinase which forms a complex with trehalose synthase TreS. This is not surprising, partly because the pep2 (also called mak) gene is usually linked with the treS gene, and in some microorganisms like Pseudomonas entomophila, Rubrobacter xylanophilus and in numerous members of the class Actinobacteria the two genes are fused into a single gene [37][38][39]. The GlgE pathway requires trehalose as a precursor of α-glucan synthesis using the combined action of four enzymes [37,[40][41][42]. In this pathway, trehalose is first isomerized to maltose by trehalose synthase (TreS). Next, maltose is phosphorylated to maltose-1-phosphate by maltokinase (Pep2) at the expense of one molecule of ATP. The phospho-activated disaccharide is a substrate for maltosyltransferase (GlgE). GlgE uses maltose-1-phosphate to elongate α(1 → 4) linked glucan chains. GlgB, the last enzyme of this pathway, mediates α(1 → 6)-branching of the glucan chain [43].
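The four steps of the GlgE pathway described above can be written down as an ordered table, which also makes the substrate bookkeeping explicit. The sketch below is a minimal illustration, not a metabolic model: the data structure and the function name are hypothetical, and the cost estimate assumes that every maltosyl unit added by GlgE derives from one trehalose and one ATP, as the text describes, while ignoring branching and any turnover.

```python
# Ordered steps of the GlgE pathway: (enzyme, substrates, products).
GLGE_PATHWAY = [
    ("TreS", "trehalose", "maltose"),
    ("Pep2", "maltose + ATP", "maltose-1-phosphate + ADP"),
    ("GlgE", "maltose-1-phosphate + glucan(n)", "glucan(n+2) + phosphate"),
    ("GlgB", "linear alpha-1,4 glucan", "alpha-1,6-branched glucan"),
]


def glge_elongation_cost(glucose_units_added):
    """Trehalose and ATP needed to add an even number of glucose units via GlgE."""
    if glucose_units_added < 0 or glucose_units_added % 2:
        raise ValueError("GlgE adds glucose units two at a time (one maltosyl unit)")
    maltosyl_units = glucose_units_added // 2
    return {"trehalose": maltosyl_units, "ATP": maltosyl_units}


for enzyme, substrates, products in GLGE_PATHWAY:
    print(f"{enzyme}: {substrates} -> {products}")
print(glge_elongation_cost(10))
```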

Trehalose metabolism
Trehalose is a disaccharide composed of two glucose units which are linked in an α,α-1,1-glycosidic linkage. It is an energy store and a stress protectant, helping bacteria to survive desiccation, cold and osmotic stress [44]. Trehalose is also an integral component of cell wall trehalose dimycolates (TDM, cord factor) found in species of the genera Mycobacterium, Nocardia, Rhodococcus and Corynebacterium [45,46]. Inspection of the C. ulceribovis genome revealed the presence of genes encoding proteins involved in trehalose biosynthesis via the GalU-OtsA-OtsB and the TreY-TreZ pathways. The GalU-OtsA-OtsB pathway is catalyzed by the galU, otsA and otsB gene products, including the enzymes UTP-glucose-[...] [47,48], whereas the TreY-TreZ pathway involves trehalose biosynthesis from glycogen-like α(1 → 4)-linked glucose polymers [47,49].
Additionally, examination of the genome revealed the presence of a gene encoding trehalose phosphorylase ([EC:2.4.1.64], A3ECDRAFT_0084). This enzyme catalyzes the phosphorolysis of trehalose to produce glucose-1-phosphate and glucose. This reaction is reversible and could give rise to trehalose from glucose-1-phosphate and glucose [50].

Insight into lipid metabolism
Fatty acid biosynthesis
Fatty acid biosynthesis is mediated by enzymes catalyzing several iterative cycles of reaction steps including condensation, reduction, dehydration and reduction [51,52]. The genes encoding enzymes necessary for fatty acid biosynthesis in C. ulceribovis DSM 45146 T were identified. Inspection of the genome revealed the presence of a single fas1 gene encoding type I fatty acid synthase FAS I ([EC:2.3.1.-], A3ECDRAFT_2083). BLASTP analysis revealed that FAS I (A3ECDRAFT_2083) is homologous to NCgl0802 in C. glutamicum ATCC 13032 T and to HMPREF0281_00958 in C. ammoniagenes DSM 20306 T, sharing 53 % and 52 % identity, respectively. FAS I (A3ECDRAFT_2083) is a single polypeptide of 3055 amino acid residues, which contains all the catalytic domains necessary to perform the iterative series of reactions for de novo fatty acid synthesis. The individual component enzymes of the various catalytic domains are acyl transferase (AT), enoyl reductase (ER), β-hydroxyacyl dehydratase (DH), malonyl/palmitoyl transferase (MPT), acyl carrier protein (ACP), β-ketoacyl reductase (KR), and β-ketoacyl synthase (KS) [53].
In addition to the fas1 gene, genes encoding the putative subunits of acetyl-CoA carboxylase were found: one gene encoding biotin carboxylase BC (α subunit) ([EC:6.3.4.14], A3ECDRAFT_2085) and the other encoding carboxyltransferase CT (β subunit) (A3ECDRAFT_2084). Acetyl-CoA carboxylase catalyzes the biotin-dependent carboxylation of acetyl-CoA to produce malonyl-CoA in the first committed step of the fatty acid biosynthesis pathway. Malonyl-CoA is then made available to the multifunctional type I FAS for de novo biosynthesis of fatty acids. FAS I synthesizes both saturated (C16:0 and C18:0) and monounsaturated (C18:1 ω9c) fatty acids [54]. In C. ulceribovis, the results of cellular fatty acid analysis are in agreement with the functional characteristics of FAS I.
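The iterative nature of de novo synthesis can be made concrete with the standard stoichiometry for a saturated, even-numbered fatty acid: one acetyl-CoA primer is extended by one malonyl-CoA per cycle, and each cycle contains two reduction steps. The sketch below is a generic bookkeeping aid under those textbook assumptions; it does not cover the monounsaturated C18:1 product, it does not distinguish which nicotinamide cofactor serves each reduction, and the function name is hypothetical.

```python
def fas_requirements(chain_length):
    """Substrate bookkeeping for de novo synthesis of a saturated even-chain fatty acid."""
    if chain_length < 4 or chain_length % 2:
        raise ValueError("chain_length must be an even number of at least 4")
    # Each elongation cycle adds two carbons from one malonyl-CoA.
    cycles = chain_length // 2 - 1
    return {
        "acetyl_CoA_primer": 1,
        "malonyl_CoA": cycles,
        "ATP_for_malonyl_CoA": cycles,  # one ATP per acetyl-CoA carboxylation
        "reduction_steps": 2 * cycles,  # keto reduction plus enoyl reduction
    }


print(fas_requirements(16))  # palmitic acid, C16:0
print(fas_requirements(18))  # stearic acid, C18:0
```

For palmitic acid, the major saturated fatty acid reported for this strain, this corresponds to seven elongation cycles.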

Fatty acid catabolism
For the catabolism of fatty acids, 16 genes encoding proteins predicted to be involved in the β-oxidation pathway of fatty acid degradation were identified. These include four fadE genes encoding acyl-CoA dehydrogenase [...] [55,56]. The subsequent detoxification of the resulting H2O2 is catalyzed by catalase ([EC:1.11.1.6]; A3ECDRAFT_0111), encoded by the katA gene of C. ulceribovis. The existence of a considerable set of genes putatively involved in β-oxidation suggests the ability of C. ulceribovis to mobilize the energy and carbon stored in fatty acids with different chain lengths.
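As a rough illustration of what β-oxidation yields from fatty acids of different chain lengths, the sketch below counts the cycles and products for a saturated, even-numbered acyl chain using the standard textbook scheme. It is an assumption-laden simplification: unsaturated chains such as oleic acid need auxiliary enzymes and are not handled, and the first oxidation of each cycle is counted generically regardless of whether a dehydrogenase or an oxidase carries it out. The function name is hypothetical.

```python
def beta_oxidation_products(chain_length):
    """Cycle and product counts for complete beta-oxidation of a saturated even-chain acyl-CoA."""
    if chain_length < 4 or chain_length % 2:
        raise ValueError("chain_length must be an even number of at least 4")
    # Every cycle releases one acetyl-CoA; the final cycle leaves a second one.
    cycles = chain_length // 2 - 1
    return {
        "cycles": cycles,
        "acetyl_CoA": chain_length // 2,
        "acyl_CoA_oxidation_steps": cycles,  # first oxidation of each cycle
        "NADH": cycles,                      # 3-hydroxyacyl-CoA dehydrogenase step
    }


for n_carbons in (12, 16, 18):
    print(n_carbons, beta_oxidation_products(n_carbons))
```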

Corynomycolic acid biosynthesis and processing
Mycolic acids, long-chain α-alkyl, β-hydroxy fatty acids, are major components of the cell wall of several genera of Corynebacterineae. They are found either covalently linked to the cell wall arabinogalactan, to form mycolyl arabinogalactan, or acylated to trehalose units to form trehalose monomycolate (TMM) and trehalose dimycolate (TDM) [57][58][59]. Mycolic acids covalently linked to the cell wall form a hydrophobic permeability barrier, also referred to as the mycomembrane, which contributes to the low permeability of the envelope of Corynebacterineae and the natural resistance of these microorganisms to various antibiotics [45,60,61]. Mycolic acids vary in size and complexity within the different genera of Corynebacterineae. Members of the genus Corynebacterium are characterized by producing short-chain C22 to C36 mycolic acids, also called corynomycolic acids, with a simple chemical structure [57].
Examination of the genome of C. ulceribovis DSM 45146 T revealed the presence of homologs of genes encoding proteins with known functions in the pathway of mycolic acid biosynthesis, processing and subsequent transport for deposition in the cell wall. These genes comprise: accD3 encoding an acyl-CoA carboxylase complex (A3ECDRAFT_1931), which catalyzes the carboxylation of palmitoyl-CoA to yield a carboxylated intermediate [62][63][64]; fadD32 encoding an acyl-CoA synthetase/AMP ligase FadD32 (A3ECDRAFT_1933), which catalyzes the activation of the meromycolate chain through the formation of meroacyl-AMP before transfer to the polyketide synthase [64,65]; pks13 encoding a polyketide synthase (A3ECDRAFT_1932) that performs the condensation of two fatty acids to form a 2-alkyl-3-keto mycolate precursor [66]; elrF encoding the envelope lipid regulation factor ElrF (A3ECDRAFT_1934), _0314, _0659, _0660, _0935), which plays a role in the regulation of mycolic acid composition in response to thermal variation in the environment [67]; cmtA, cmtC and cmtB encoding trehalose mycolyltransferases (A3ECDRAFT_0077), (A3ECDRAFT_1936) and (A3ECDRAFT_1937), respectively, which catalyze: a) the transfer of a mycolyl residue onto trehalose, thereby generating TMM, b) the transfer of a mycolyl residue from one molecule of TMM to another TMM, leading to the formation of TDM, and c) the transfer of mycolate from TMM to arabinogalactan, forming the cell wall arabinogalactan-mycolate polymer [68][69][70]; mmpL encoding membrane transport proteins of the MmpL family (A3ECDRAFT_0066, A3ECDRAFT_1927, A3ECDRAFT_2155), which are involved in the translocation of TMM to the outside of the bacterial cell for subsequent use as a substrate for cell wall mycolylation [71]; and cmrA encoding the short-chain dehydrogenase/reductase CmrA (A3ECDRAFT_1367), which catalyzes the reduction of the mycolate precursor to produce the mature trehalose mycolates and subsequent covalent attachment onto the cell wall [72]. These genes clustered together forming a locus in the chromosome (Fig. 4). The overall organization of the entire locus in all mycolic acid-containing Actinobacteria is almost identical, although a slight difference is apparent in the mycolyltransferase region (Fig. 4). This gene repertoire is consistent with the detection of mycolic acids in the cell envelope of C. ulceribovis DSM 45146 T by thin-layer chromatography [1].
Although the pgsA paralogs (A3ECDRAFT_0785, A3ECDRAFT_0837 and A3ECDRAFT_1077) were annotated as PgsA/CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase, it seems likely that they are not functionally related. The pgsA (A3ECDRAFT_0837) genomic region in C. ulceribovis showed an organization similar to that found in other bacteria (Fig. 5). In all these groups pgsA is the second gene of a cluster of four to five genes potentially organized as an operon. The first ORF of this cluster (A3ECDRAFT_0836), located upstream of the pgsA gene, encodes a protein of unknown function. The third ORF (A3ECDRAFT_0838), located downstream of the pgsA gene, encodes a protein with similarities to bacterial acyltransferases (53 % identity with the homolog Rv2611c in M. tuberculosis H37Rv). The fourth ORF (A3ECDRAFT_0839) encodes a putative α-mannosyltransferase PimA (49 % identity with the homolog Rv2610c in M. tuberculosis H37Rv). Genetic evidence has shown that the pimA ortholog (Rv2610c) in M. tuberculosis H37Rv encodes an enzyme essential for mycobacterial growth that initiates the biosynthetic pathway of PIMs [86,87]. Therefore, in C. ulceribovis, the presence of the pgsA (A3ECDRAFT_0837) and pimA (A3ECDRAFT_0839) genes together within a cluster of genes suggests that PgsA (A3ECDRAFT_0837) may be a phosphatidylinositol synthase involved in PI biosynthesis, and that PI could be mannosylated by PimA (A3ECDRAFT_0839), leading to the synthesis of PIM. However, experimental verification of the function of the protein (A3ECDRAFT_0837) remains to be performed.

[...] [89]. The murCDEFG genes are organized in a cluster located in the center of the conserved dcw (division cell wall) region in the order shown in Fig. 6, whereas the murABI genes are located elsewhere in the chromosome.
meso-DAP, the third residue in the PG pentapeptide [90], is an important chemotaxonomic marker of members of the Corynebacterineae, including the genus Corynebacterium, and it is essential for both peptidoglycan and lysine biosynthesis in bacteria. The genome sequence data indicate that C. ulceribovis synthesizes meso-DAP from aspartate via the dehydrogenase variant of the DAP pathway [91,92]. The genes [...]

Moreover, the C. ulceribovis genome contains two ldt genes encoding two L,D-transpeptidases (Ldt), LdT1 (A3ECDRAFT_1351) and LdT2 (A3ECDRAFT_1870). The L,D-transpeptidases are a group of carbapenem-sensitive enzymes that participate in the remodeling of the peptidoglycan network by formation of 3 → 3 crosslinks between two adjacent meso-DAP residues (meso-DAP → meso-DAP bridges) instead of the 4 → 3 crosslinks (D-Ala → meso-DAP) generated by the D,D-transpeptidase activity of the PBPs, and can thus render the peptidoglycan resistant to the hydrolytic activity of endopeptidases [94,95].

Cofactor biosynthesis
Organic cofactors play crucial roles in the catalysis of biochemical reactions in the metabolism of all living organisms. Inspection of the C. ulceribovis DSM 45146 T genome revealed genes encoding enzymes involved in the de novo biosynthetic pathways for several cofactors, such as pyridoxal-5-phosphate, lipoic acid, flavin nucleotides, folate, pantothenate, thiamine, nicotinic acid, biotin and menaquinones.
The supplied additional files give an overview of the de novo biosynthetic and salvage pathways for some of these cofactors.

Folic acid (Vitamin B9) biosynthesis
Genes encoding all the enzymes of the folate biosynthetic pathways are present (Additional file 2). The first enzyme of the pterin branch is GTP cyclohydrolase (FolE), which catalyzes the conversion of GTP to 7,8-dihydroneopterin triphosphate [98], which is converted to the corresponding monophosphate by alkaline phosphatase D [EC:3.1.3.1]. The three genes folBKP, which encode the three enzymes dihydroneopterin aldolase, 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine diphosphokinase and dihydropteroate synthase, respectively, form an operon. The three enzymes catalyze the stepwise conversion of dihydroneopterin to 7,8[...]

Pantothenic acid (Vitamin B 5 ) and coenzyme A (CoA) biosynthesis
Like other bacteria, C. ulceribovis synthesizes coenzyme A (CoA) via pantothenic acid from aspartate and α-ketoisovalerate (Additional file 3). The CoA biosynthetic route requires nine enzymes: four to synthesize pantothenic acid (I-VI) and five to produce CoA (VI-XI). With the exception of the gene encoding 2-dehydropantoate 2-reductase PanE (EC 1.1.1.169), which catalyzes the reduction of 2-dehydropantoate (IV), all pantothenate and CoA biosynthesis genes are annotated in C. ulceribovis. Although the genome lacks the panE gene encoding 2-dehydropantoate 2-reductase (KPR), a gene (A3ECDRAFT_1818) encoding a predicted oxidoreductase, which contains short-chain dehydrogenase (SDR) and DUF2520 domains, is present in the genome. BLASTP analysis revealed that the protein (A3ECDRAFT_1818) is 41 % identical to ketopantoate reductase (PanE/ApbA) in Corynebacterium durum F0235. Homologs of this KPR protein are present in other bacteria such as Enterococcus faecalis V583 (EF1861), Francisella novicida (FTT1388) and Clostridium difficile. The KPR protein has been shown to also catalyze the conversion of 2-dehydropantoate to pantoate in Francisella species [99].

Nicotinic acid (Vitamin B3) and nicotinamide adenine dinucleotide (NAD) biosynthesis
NAD and its reduced and phosphorylated derivatives, NADH, NADP and NADPH, function as reducing equivalents for cellular biochemistry and energy metabolism. The genome of C. ulceribovis DSM 45146 T carries the genes encoding enzymes involved in NAD biosynthesis via both the canonical de novo pathway from L-aspartate and the salvage biosynthetic pathway from nicotinamide. In the de novo pathway, nicotinic acid mononucleotide (NaMN) is synthesized in three enzymatic steps from L-aspartate, followed by two enzymatic steps to complete the synthesis of NAD (Additional file 5). In the salvage pathway, nicotinamide is converted in four steps through nicotinate, nicotinate D-ribonucleotide and deamino-NAD+ to intact NAD+ (Additional file 5).

Biotin (Vitamin H) biosynthesis
Biotin is an essential cofactor for biotin-dependent carboxylases, which catalyze the transfer of a carboxylate group from a donor to an acceptor molecule [102]. Biotin synthesis can be subdivided into the synthesis of pimeloyl-CoA from pimelic acid followed by the biotin ring assembly [103]. The bioA-bioD and bioB genes encoding the enzymes involved in the biotin ring assembly were identified in the C. ulceribovis DSM 45146 T genome. However, the pathway of biotin biosynthesis in C. ulceribovis DSM 45146 T is incomplete due to the lack of at least the bioF and bioW genes. Moreover, the genome contains the bioY-bioM-bioN genes encoding the protein components BioY (A3ECDRAFT_0764), BioM (A3ECDRAFT_0763) and BioN (A3ECDRAFT_0762), which constitute a tripartite biotin transporter [104]. The birA gene encoding the BirA protein was also identified.

Menaquinone (Vitamin K2)
Menaquinone (MK) plays a key role as an electron carrier in the electron transport of the respiratory chain in prokaryotes [105]. The genome of C. ulceribovis is also equipped with the genes for the biosynthetic pathway of menaquinone from chorismate. In this pathway, chorismate is converted into 1,4-dihydroxy-2-naphthoate (DHNA) via isochorismate by five enzymes encoded by the menFDCEB genes. DHNA is converted to MK after prenylation (catalyzed by MenA) and methylation (catalyzed by MenG). Since menaquinones are the only type of isoprenoid quinones found in the genera of the suborder Corynebacterineae, including the genus Corynebacterium, the presence of genes encoding enzymes that catalyze the biosynthesis of menaquinone in the genome of C. ulceribovis DSM 45146 T is consistent with its classification in the genus Corynebacterium. Menaquinones are widely used as chemotaxonomic markers. The taxonomic value of menaquinones lies in their chain length and degree of unsaturation [106].

CRISPR/Cas system and immunity to phage attack
Analysis of the genome sequence revealed that C. ulceribovis employs various defense mechanisms to overcome phage infections. These include restriction of penetrating phage DNA (restriction-modification (R-M) system), the abortive phage infection (Abi) system, and the clustered regularly interspaced short palindromic repeats (CRISPR)-associated (Cas) proteins (CRISPR/Cas) system.
The genome contains six paralogs of the hsd genes encoding type I R-M enzymes. These include two hsdR paralogs encoding two R subunits of the type I restriction enzyme HsdR (A3ECDRAFT_1188 and A3ECDRAFT_1676). The HsdR subunit is responsible for restriction, the HsdM subunit is involved in modification and the HsdS subunit is responsible for specific sequence recognition. None of them shows any activity as a single protein [107]. For modification activity, a combination of one HsdS and two HsdM subunits is required, and for restriction activity all subunits are absolutely required in a stoichiometric ratio of R2M2S1 [107]. The M2S1 multifunctional enzyme acts as a protective methyltransferase [108], whereas the holoenzyme exhibits both endonucleolytic and helicase activities. The principal function of the R-M system is to protect the bacterial cell against invading DNA, including viruses [109].
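The subunit requirements stated above can be summarized as a small decision rule. The sketch below encodes only what the text says, namely that M2S1 suffices for modification and that R2M2S1 is needed for restriction; the function name and the idea of passing subunit counts per complex are illustrative assumptions, not a model of complex assembly in the cell.

```python
def type_i_rm_activities(hsd_r, hsd_m, hsd_s):
    """Activities expected for a type I R-M complex with the given subunit counts."""
    # M2S1 is the minimal methyltransferase; single subunits are inactive.
    can_modify = hsd_m >= 2 and hsd_s >= 1
    # Restriction additionally requires two R subunits (R2M2S1 holoenzyme).
    can_restrict = can_modify and hsd_r >= 2
    return {"modification": can_modify, "restriction": can_restrict}


print(type_i_rm_activities(hsd_r=2, hsd_m=2, hsd_s=1))  # holoenzyme
print(type_i_rm_activities(hsd_r=0, hsd_m=2, hsd_s=1))  # methyltransferase only
print(type_i_rm_activities(hsd_r=1, hsd_m=0, hsd_s=0))  # single subunit
```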
In addition, a gene (A3ECDRAFT_0290) encoding a protein annotated as Abi-like protein was identified. This protein contains an Abi_2 domain (pfam07751), which has been shown to mediate bacteriophage resistance by abortive infection [110]. Activation of Abi protein limits phage replication within a bacterial population and promotes bacterial cell death [111,112].
Moreover, the C. ulceribovis DSM 45146 T genome contains two CRISPR loci together with the associated cas genes. CRISPR locus 1 spans 1070 bp, harbors 17 spacer sequences and is not accompanied by cas genes in its direct proximity. CRISPR locus 2 spans 6893 bp, harbors 102 spacer sequences and is flanked by seven cas genes [cas3 (A3ECDRAFT_1586), cse1 (A3ECDRAFT_1587), cse2 (A3ECDRAFT_1588), cas7 (A3ECDRAFT_1589), cas5 (A3ECDRAFT_1590), cas6 (A3ECDRAFT_1591) and cas1 (A3ECDRAFT_1592)]. The consensus sequences of the direct repeats of the two CRISPR regions are identical, with a length of 28 bp (GTGTTCCCCGCGCAGGCGGGGATGAGCC), and the repeats are separated by spacers with variable nucleotide sequences. CRISPRs provide the cell with acquired immunity against bacteriophages, plasmids and other mobile genetic elements by an RNA interference-like mechanism [113,114].
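The repeat-spacer architecture described above can be illustrated by splitting an array sequence on the 28 bp consensus repeat. The sketch below is a toy example: the two spacer sequences are invented purely for illustration and are not taken from the genome, the function name is hypothetical, and exact string matching is a simplification because repeats in real arrays can carry point mutations.

```python
# Consensus direct repeat reported for both CRISPR loci.
REPEAT = "GTGTTCCCCGCGCAGGCGGGGATGAGCC"


def extract_spacers(array_seq, repeat=REPEAT):
    """Return the sequences lying between consecutive exact copies of the repeat."""
    parts = array_seq.upper().split(repeat)
    # parts[0] precedes the first repeat and parts[-1] follows the last one,
    # so only the inner pieces are spacers.
    return parts[1:-1]


# Invented spacers, used only to build a toy array: repeat-spacer-repeat-spacer-repeat.
toy_spacers = [
    "ACGTACGTACGTACGTACGTACGTACGTACGTA",
    "TTGACCATGGCATTAGCCATGACCTTAGGCAAT",
]
toy_array = REPEAT + REPEAT.join(toy_spacers) + REPEAT

print(len(REPEAT))
for spacer in extract_spacers(toy_array):
    print(len(spacer), spacer)
```

An array with n spacers built this way contains n + 1 repeats, which is the usual layout of a CRISPR array.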

Insight into protein secretion systems
Secreted proteins play essential roles in bacteria, including the colonization of niches and host-pathogen interactions [115,116]. In Gram-positive bacteria, protein secretion is mediated mainly by the general secretory (Sec) and the twin-arginine translocation (Tat) pathways. Some Gram-positive bacteria, e.g. mycobacteria, nocardia and corynebacteria, have a specialized type VII secretion system (T7SS) for secretion of WXG100 family proteins.
Inspection of the C. ulceribovis DSM 45146 T genome revealed the presence of all genes encoding proteins of the Sec translocation system. These include proteins forming the main membrane channel-forming complex SecYEG (A3ECDRAFT_0227/A3ECDRAFT_0157/A3ECDRAFT_0912), the cytosolic ATPase SecA (A3ECDRAFT_0372 and A3ECDRAFT_1078), the auxiliary proteins SecD (A3ECDRAFT_0848), SecF (A3ECDRAFT_0849) and YajC (A3ECDRAFT_0847), and the chaperones Ffh (A3ECDRAFT_0690) and FtsY (A3ECDRAFT_0689). As in other Gram-positive bacteria, the genome lacks homologs of the SecB protein, the chaperone that targets proteins to the Sec translocon for passage through the cytoplasmic membrane. Genes encoding the twin-arginine translocase (Tat) system, tatA/E, tatB, tatC and tatD, were also present in the genome. As in the majority of other sequenced actinobacterial genomes, the tatA/E gene (A3ECDRAFT_0977) was found next to tatC (A3ECDRAFT_0978), while the tatB gene (A3ECDRAFT_0538) and the tatD gene (A3ECDRAFT_1228) were located separately. The distinguishing feature of the Tat system is its ability to translocate fully folded proteins across the cytoplasmic membrane using the transmembrane proton gradient as the main driving force for translocation [117].
A putative type IVb pilus-encoding gene cluster, similar to the tad (tight adherence) locus in Haemophilus actinomycetemcomitans, was identified in the genome of C. ulceribovis. The genes of this tad locus appear to be organized as two adjacent clusters. The first cluster contained four genes encoding a homolog of the TadZ protein (A3ECDRAFT_0049), followed by the TadA protein (A3ECDRAFT_0050) and two integral membrane proteins, TadB (A3ECDRAFT_0051) and TadC (A3ECDRAFT_0052). The second cluster contained three genes encoding a low-molecular-weight protein (68 aa) containing a DUF4244 domain (A3ECDRAFT_0053), followed by an unknown protein (A3ECDRAFT_0054) and a TadE-like protein (A3ECDRAFT_0055). A gene encoding a putative prepilin peptidase PilD (A3ECDRAFT_0873) was not linked to the tad locus and was located distantly in the genome. The tad export apparatus facilitates the export and assembly of pili, which mediate the nonspecific adhesion of bacteria to surfaces and are essential for host colonization and pathogenesis [118][119][120].
Fig. 7 Comparison of gene clusters that encode the type VII secretion system (T7SS, also called ESX) in C. ulceribovis DSM 45146 T and variants that are present in other mycolic acid-containing taxa of the Corynebacterineae. Six genes encoding six proteins are generally present in all the examined species. These proteins are: two members of the ESAT-6 family (ESAT-6 and CFP-10); a member of the FtsK/SpoIIIE family (EccCab); a subtilisin-like protease (MycP); an integral membrane protein with 10-11 transmembrane domains (EccD); and a member of another membrane-protein family (EccB). In addition, two proteins, PE (proline-glutamic acid) and PPE (proline-proline-glutamic acid), encoded by two genes, are shared by some, but not all, T7SSs. Orthologs are shown by matching colors

Conclusions
The availability of a high-quality genome sequence from C. ulceribovis provided crucial insights into the broad biological functions of this organism. Genome analysis showed that the overall features of C. ulceribovis are similar to those of the genus Corynebacterium; it possesses a complete set of peptidoglycan biosynthesis genes, synthesizes meso-DAP from aspartate via the dehydrogenase pathway, possesses all genes for menaquinone biosynthesis from chorismate and has a complete set of genes for the biosynthesis and processing of mycolic acids. C. ulceribovis also possesses a single fas1 gene encoding the type I fatty acid synthase FAS I for de novo fatty acid biosynthesis and a complete set of genes associated with fatty acid degradation by the β-oxidation pathway. Genes encoding enzymes associated with central carbohydrate metabolism were identified. C. ulceribovis possesses a complete TCA cycle and glyoxylate shunt; a functional PPP for generation of pentoses and NADPH for anabolic purposes; all genes necessary for glycogen metabolism; and genes for trehalose synthesis via the OtsA-OtsB pathway. The genome also contains genes encoding myo-inositol-3-phosphate synthase and inositol monophosphatase, involved in the biosynthesis of myo-inositol from glucose-6-phosphate, as well as a gene encoding the α-mannosyltransferase PimA, leading to the synthesis of PIM. To meet cofactor requirements, genes encoding enzymes that catalyze de novo biosynthetic pathways for several cofactors are present in the genome. Finally, the genome of C. ulceribovis harbors genes encoding proteins that protect the cells against bacteriophage infection. These include type I restriction-modification (R-M) enzymes, an Abi-like protein that mediates bacteriophage resistance by abortive infection (Abi system) and a CRISPR/cas system that serves as a molecular "vaccination card".

Competing Interests
The authors declare that they have no competing interests.
Authors' Contributions
NCK conceived the study, oversaw the project, and analyzed data. AFY performed the phenotypic and phylogenetic characterizations of the organism, wrote the manuscript and prepared the figures. All authors read and approved the final manuscript.