Complete genome sequence of esterase-producing bacterium Croceicoccus marinus E4A9T

Wu, Yue-Hong; Cheng, Hong; Huo, Ying-Yi; Xu, Lin; Liu, Qian; Wang, Chun-Sheng; Xu, Xue-Wei

doi:10.1186/s40793-017-0300-0

Short genome report
Open access
Published: 21 December 2017

Standards in Genomic Sciences volume 12, Article number: 88 (2017)

Abstract

Croceicoccus marinus E4A9^T was isolated from deep-sea sediment collected from the East Pacific polymetallic nodule area. The strain produces esterase, an enzyme widely used in the food, perfume, cosmetic, chemical, agricultural and pharmaceutical industries. Here we describe the characteristics of strain E4A9^T, including the genome sequence and annotation, the presence of esterases, and the metabolic pathways of the organism. The genome of strain E4A9^T comprises 4,109,188 bp, with one chromosome (3,001,363 bp) and two large circular plasmids (761,621 bp and 346,204 bp, respectively). The complete genome contains 3653 coding sequences, 48 tRNAs, two operons of 16S–23S–5S rRNA genes and three ncRNAs. Strain E4A9^T encodes 10 esterase-related genes, and three of the esterases (E3, E6 and E10) were successfully cloned and expressed in Escherichia coli Rosetta in a soluble form, revealing their potential application in the biotechnological industry. Moreover, the genome provides clues to the metabolic pathways of strain E4A9^T, reflecting its adaptations to the ambient environment. The genome sequence of C. marinus E4A9^T now provides fundamental information for future studies.

Introduction

Lipolytic enzymes, including esterase (EC 3.1.1.1) and lipase (EC 3.1.1.3), are a general class of carboxylic ester hydrolases (EC 3.1.1), which catalyze the hydrolytic cleavage and formation of ester bonds [1, 2]. Esterase shows a preference for water-soluble short chain fatty acids (< 10 carbon atoms), while lipase prefers water-insoluble longer chain fatty acids (> 10 carbon atoms) [3, 4]. Many esterases do not require cofactors and have high stereospecificity toward chemicals, broad substrate specificity and high stability in organic solvents [4]. They are extensively used in the food, perfume, cosmetic, chemical, agricultural and pharmaceutical industries [5].

Croceicoccus [6], a genus of the family Erythrobacteraceae [7], is found in marine environments, including deep-sea sediment, surface seawater and marine biofilm from a boat shell [6, 8, 9]. C. marinus E4A9^T, the type strain of the type species of the genus Croceicoccus, was isolated from deep-sea sediment collected from the East Pacific polymetallic nodule area [6]. The strain was able to produce esterase as well as lipase [6]. To gain insight into its capacity for esterase production, we obtained the complete genome of C. marinus E4A9^T and identified its esterase genes. This is the first genome report for a strain of the genus Croceicoccus. We also describe the genome sequencing and annotation, which help in understanding the metabolic and ecological functions of the strain in its environment.

Organism information

Classification and features

C. marinus E4A9^T was isolated from a deep-sea sediment sample collected from the East Pacific polymetallic nodule area (8°22′38″ N, 145°23′56″ W) at a depth of 5280 m (temperature 2 °C, salinity 3.4%). Strain E4A9^T was obtained and routinely cultured on marine broth 2216 (MB, BD) at 30 °C. A polyphasic study of strain E4A9^T was subsequently performed, and a new species, Croceicoccus marinus gen. nov., sp. nov., was proposed. Strain E4A9^T is the type strain of the species C. marinus [6] and was deposited in the China General Microbiological Culture Collection (CGMCC 1.6776^T).

C. marinus [6] is a valid species belonging to the family Erythrobacteraceae [7], in the order Sphingomonadales [10, 11], class Alphaproteobacteria [11, 12] and phylum Proteobacteria [13]. C. marinus E4A9^T is a Gram-staining-negative, coccus-shaped bacterium (Fig. 1). It grows aerobically and uses a range of organic carbon compounds, such as L-arabinose, D-cellobiose, D-galactose and xylose, as sole sources of carbon and energy [6, 8]. Based on phylogenetic analysis of the 16S rRNA gene sequence, the strain falls into the cluster comprising the Croceicoccus species with a high bootstrap value (Fig. 2). Interestingly, strain E4A9^T can hydrolyze Tween 20, Tween 80 and tributyrin, indicating the presence of esterase as well as lipase [6]. The API ZYM system also confirmed that esterase (C4) and esterase lipase (C8) activities are present. The general features of strain E4A9^T are summarized in Table 1.

Table 1 Classification and general features of Croceicoccus marinus E4A9^T according to the MIGS recommendations [30]

Genome sequencing information

Genome project history

C. marinus E4A9^T [6] was selected for sequencing because of its relevance to genome sequencing of the whole family Erythrobacteraceae [7] and to esterase production. The complete genome sequence was finished on May 29, 2015. The gap closure and annotation processes were performed by the authors. The GenBank accession numbers of the genome are CP019602, CP019603 and CP019604. The main genome sequence information is presented in Table 2 and Table 3.

Table 2 Genome sequencing project information

Table 3 Summary of genome: one chromosome and two plasmids

Growth conditions and genomic DNA preparation

C. marinus E4A9^T was aerobically cultivated in Marine Broth (MB, BD Difco™) at 30 °C and stored at −80 °C with 30% (v/v) glycerol. High-quality genomic DNA was extracted using the Qiagen DNA extraction kit, according to the manufacturer's protocol.

Genome sequencing and assembly

The genome of strain E4A9^T was sequenced using SMRT technology with a PacBio RS II platform (Zhejiang Tianke Co. Ltd., China). One library was constructed with 10 kb insert size according to the large SMRTbell gDNA protocol (Pacific Biosciences, USA). The sequencing generated 85,372 reads with an average length of 11,938 nt (972 Mb, 248-fold genome coverage). The de novo assembly of the reads was performed using HGAP Assembly version 2 (Pacific Biosciences, USA). The circularization of final contigs was checked and the overlapping ends were trimmed.
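The yield and coverage figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not the authors' procedure: the function and variable names are illustrative, and because the reported mean read length is rounded, the computed total is approximate. Under these assumptions the product of read count and mean length is about 1019 million bases, which reproduces the 248-fold coverage; the quoted 972 Mb figure matches the same total only when expressed in units of 2^20 bases, which is an inference rather than something the text states.

```python
# Figures quoted in the text; names below are illustrative, not from the paper.
READ_COUNT = 85_372
MEAN_READ_LENGTH_NT = 11_938   # reported average, rounded, so totals are approximate
GENOME_SIZE_BP = 4_109_188     # chromosome plus both plasmids


def sequencing_yield(read_count: int, mean_read_length: int) -> int:
    """Approximate total number of sequenced bases."""
    return read_count * mean_read_length


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average number of times each genome position is covered."""
    return total_bases / genome_size


total_bases = sequencing_yield(READ_COUNT, MEAN_READ_LENGTH_NT)
coverage = fold_coverage(total_bases, GENOME_SIZE_BP)

print(f"total bases (approx.): {total_bases:,}")
print(f"in units of 10^6 bases: {total_bases / 1e6:.0f}")
print(f"in units of 2^20 bases: {total_bases / 2**20:.0f}")
print(f"coverage: {coverage:.0f}-fold")
```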

Genome annotation

The rRNA genes were found via the RNAmmer 1.2 Server [14] and tRNA genes were identified using the tRNAscan-SE 2.0 online server [15]. Prediction of open reading frames (ORFs) and functional annotation of the translated ORFs were performed using the RAST server online [16] and GeneMarkS+. Classification of some predicted genes was analyzed using the COG database [17] and Pfam [18]. Genes with signal peptides were predicted using the SignalP 4.1 Server [19]. Genes with transmembrane helices were predicted using the TMHMM Server v. 2.0 [20]. Clustered regularly interspaced short palindromic repeat structures in the genome were searched using the CRISPRFinder program online [21]. Translated genes were assigned to Kyoto Encyclopedia of Genes and Genomes pathways using the KEGG automatic annotation server with the BBH method [22, 23]. The circular maps of the chromosome and plasmids were obtained using the CGView online server [24].

Genome properties

The general features of strain E4A9^T are displayed in Table 1 and Table 2. The complete genome comprises 4,109,188 bp, with one chromosome (3,001,363 bp) and two large circular plasmids (plasmid pCME4A9I, 761,621 bp and plasmid pCME4A9II, 346,204 bp, respectively) (Fig. 3). The G + C content was 64.5 mol%. The genome of strain E4A9^T contains 3653 coding sequences (CDSs), 48 tRNAs, two operons of 16S–23S–5S rRNA genes and three ncRNAs. Among the genes, 132 were annotated as pseudogenes. The summary of features and statistics of the genome is shown in Table 4 and genes belonging to COG functional categories are listed in Table 5.

Table 4 Genome statistics of Croceicoccus marinus E4A9^T

Table 5 Number of genes associated with general COG functional categories

The genome of strain E4A9^T consists of three replicons: a circular chromosome and two large plasmids. A plasmid replication initiator protein gene was found on each of the two plasmid sequences (ARU17925 and ARU18299, respectively), supporting the conclusion that the genome contains two large circular plasmids. The G + C contents of the two plasmids (63.5 mol% and 60.7 mol%, respectively) were a little lower than that of the chromosome (65.2 mol%). The two plasmids have high gene density, with 702 and 303 protein-coding regions, respectively. Many unsuspected genes involved in the metabolism of aromatic compounds were identified on plasmid pCME4A9I. Almost 10% of the plasmid pCME4A9II sequence carries genes assigned to the subsystem feature virulence, disease and defense, most of which are involved in copper homeostasis and cobalt-zinc-cadmium resistance. These gene functions are consistent with the notion that the two plasmids play an important role in the adaptation of the bacterium to the sediment environment.
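The replicon figures reported in this section are internally consistent, which can be verified with a short calculation. The sketch below is illustrative only (the dictionary layout and function names are assumptions, not part of the original analysis): it sums the three replicon lengths, computes the length-weighted G + C content of the whole genome from the per-replicon values, and derives a rough coding density for each plasmid.

```python
# Replicon lengths (bp) and G + C contents (mol%) as quoted in the text.
REPLICONS = {
    "chromosome": (3_001_363, 65.2),
    "pCME4A9I": (761_621, 63.5),
    "pCME4A9II": (346_204, 60.7),
}
# Protein-coding regions reported for the two plasmids.
PLASMID_CDS = {"pCME4A9I": 702, "pCME4A9II": 303}


def genome_size(replicons: dict) -> int:
    """Total genome length as the sum of all replicon lengths."""
    return sum(length for length, _ in replicons.values())


def weighted_gc(replicons: dict) -> float:
    """Genome-wide G + C content, weighting each replicon by its length."""
    total = genome_size(replicons)
    return sum(length * gc for length, gc in replicons.values()) / total


def cds_per_kb(cds_count: int, length_bp: int) -> float:
    """Coding density expressed as coding regions per kilobase."""
    return cds_count / (length_bp / 1000)


print(f"genome size: {genome_size(REPLICONS):,} bp")
print(f"weighted G + C: {weighted_gc(REPLICONS):.1f} mol%")
for name, count in PLASMID_CDS.items():
    density = cds_per_kb(count, REPLICONS[name][0])
    print(f"{name}: {density:.2f} coding regions per kb")
```

The weighted value agrees with the 64.5 mol% reported for the complete genome, and the summed length agrees with the reported 4,109,188 bp.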

Insights from the genome sequence

Esterases presence of C. marinus E4A9^T

The presence of genes for biotechnologically important enzymes such as lipolytic enzymes was also predicted. Ten novel esterases were predicted (Fig. 4), and their amino acid sequences shared 58% to 85% identity with those of other lipolytic enzymes in the database. Phylogenetic analysis showed that the predicted esterases E3 and E6 grouped with family VII lipolytic enzymes and E10 grouped with family II lipolytic enzymes. To investigate the biochemical properties of the esterases E3, E6 and E10, recombinant plasmids were constructed and expressed in Escherichia coli [25, 26]. After incubation for 48 h on Luria-Bertani agar plates supplemented with 1% tributyrin, all three recombinant colonies were surrounded by clear zones, indicating the presence of lipolytic activity. The calculated molecular weights of E3, E6 and E10 were 55.9, 46.1 and 22.4 kDa, respectively. The recombinant proteins were soluble and were purified using a Ni-NTA affinity chromatography column. The activity of purified E3, E6 and E10 was examined using p-nitrophenyl butyrate as substrate, and all three showed specific activity under standard reaction conditions (data not shown).
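A calculated molecular weight such as those quoted above is typically obtained by summing the average masses of the amino acid residues and adding one water molecule for the free termini. The text does not say which tool the authors used, so the following is only a minimal sketch of that general approach; the residue masses are standard average values, and the example peptide is hypothetical, not one of the esterase sequences.

```python
# Average residue masses in Da (amino acid minus one water), standard values.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal groups


def molecular_weight_da(sequence: str) -> float:
    """Average molecular weight (Da) of an unmodified protein sequence."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not sequence:
        return 0.0
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


# Hypothetical toy peptide used only to exercise the function.
toy_peptide = "ACDEFGHIK"
print(f"{toy_peptide}: {molecular_weight_da(toy_peptide) / 1000:.3f} kDa")
```

Applied to a full-length translated ORF, the same sum would give a value in kDa comparable to those reported for E3, E6 and E10; affinity tags or signal peptides, if present, would change the result.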

Metabolism of C. marinus E4A9^T

The complete genome of C. marinus E4A9^T was annotated to assess its metabolic potential, based on the key genes of the metabolic pathways of carbon, nitrogen, sulfur and phosphorus.

(i) Carbon metabolism. The genome of strain E4A9^T lacks carbon fixation and CO-oxidizing (cox) genes, indicating that the strain is not able to grow autotrophically. Strain E4A9^T can use organic carbon sources (Table 1). The genome has a complete glycolysis pathway (Embden-Meyerhof-Parnas pathway). In addition, it possesses key genes of the Entner-Doudoroff pathway, the pentose phosphate pathway, and the tricarboxylic acid cycle.

(ii) Nitrogen metabolism. The genome of C. marinus E4A9^T possesses ammonium transporter genes and amino acid transporter genes (e.g. for methionine and L-proline/glycine betaine). Genes encoding enzymes involved in polyamine biosynthesis are present, but the lack of polyamine transporters suggests that the strain cannot utilize extracellular polyamines. Nitrate and nitrite transporters were found in the genome of strain E4A9^T. It possesses genes involved in nitrate and nitrite reduction (nasAB and nirBD, respectively) and lacks genes involved in denitrification, nitrogen fixation and anammox. Thus, nitrate and nitrite could act as electron acceptors to generate ammonium, which strain E4A9^T could subsequently use as a reduced nitrogen source. The genome of C. marinus E4A9^T lacks urease (ureABC); however, it harbors genes involved in urea decomposition, including urea carboxylase-related ABC transporter, urea carboxylase-related aminomethyltransferase, urea carboxylase and allophanate hydrolase, suggesting that it can utilize urea as a C or N source in the environment [27].

(iii) Sulfur metabolism. Strain E4A9^T possesses genes involved in assimilatory sulfate reduction (e.g. cysND, cysC, cysH, cysJI). Sulfate can be reduced to sulfide, which is subsequently incorporated into amino acids. Genes involved in alkanesulfonate assimilation (arylsulfatase and FMN reductase) are present in the genome of strain E4A9^T, suggesting that it can utilize organic sulfur compounds. However, it lacks transporter genes for the uptake of extracellular alkanesulfonates.

(iv) Phosphorus metabolism. Strain E4A9^T lacks genes for inorganic P storage as polyphosphate (ppk), as well as for transport (phnCDE) and cleavage (phnGHIJKLN) of organic P in the form of phosphonates [28]. However, strain E4A9^T possesses the high-affinity phosphate transport system (pstSCAB) and regulatory genes (phoUBR), indicating an alternative strategy for maintaining a reliable supply of phosphorus [29].
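The gene-cluster shorthand used in this section (for example pstSCAB for pstS, pstC, pstA and pstB) can be expanded programmatically to build a simple presence/absence table of marker genes. The sketch below is a hypothetical helper for illustration, not part of the authors' annotation workflow; it assumes the common convention of a three-letter stem followed by one capital letter per gene, and the presence flags merely restate what the text reports.

```python
def expand_gene_cluster(shorthand: str) -> list[str]:
    """Expand shorthand such as 'pstSCAB' into individual gene names.

    Assumes a three-letter stem followed by one letter per gene; a bare
    stem such as 'ppk' is returned unchanged.
    """
    stem, letters = shorthand[:3], shorthand[3:]
    if not letters:
        return [stem]
    return [stem + letter for letter in letters]


# Presence (True) or absence (False) as stated in the text for strain E4A9^T.
REPORTED_CLUSTERS = {
    "nasAB": True,        # nitrate reduction
    "nirBD": True,        # nitrite reduction
    "ureABC": False,      # urease
    "cysND": True,        # assimilatory sulfate reduction
    "cysC": True,
    "cysH": True,
    "cysJI": True,
    "ppk": False,         # polyphosphate storage
    "phnCDE": False,      # phosphonate transport
    "phnGHIJKLN": False,  # phosphonate cleavage
    "pstSCAB": True,      # high-affinity phosphate transport
    "phoUBR": True,       # phosphate regulation
}


def gene_presence(clusters: dict[str, bool]) -> dict[str, bool]:
    """Map every individual gene name to its reported presence flag."""
    presence = {}
    for shorthand, present in clusters.items():
        for gene in expand_gene_cluster(shorthand):
            presence[gene] = present
    return presence


presence = gene_presence(REPORTED_CLUSTERS)
absent = sorted(gene for gene, present in presence.items() if not present)
print("reported absent:", ", ".join(absent))
```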

Conclusions

The complete genome of C. marinus E4A9^T consists of a circular chromosome and two large circular plasmids, and provides insight into the genomic basis of its esterase production. Our data imply that C. marinus E4A9^T is a potential candidate for biotechnological application, and the genome facilitates further understanding of the industrial and biotechnological applications of esterases.

Abbreviations

CDS: Coding sequence
CRISPRs: Clustered regularly interspaced short palindromic repeats
KAAS: KEGG automatic annotation server
KEGG: Kyoto Encyclopedia of Genes and Genomes
ORF: Open reading frame

References

1. Jiang X, Xu X, Huo Y, Wu Y, Zhu X, Zhang X, et al. Identification and characterization of novel esterases from a deep-sea sediment metagenome. Arch Microbiol. 2012;194:207–14.
2. Lopez-Lopez O, Cerdan ME, Gonzalez Siso MI. New extremophilic lipases and esterases from metagenomics. Curr Protein Pept Sci. 2014;15:445–55.
3. Arpigny JL, Jaeger KE. Bacterial lipolytic enzymes: classification and properties. Biochem J. 1999;343:177–83.
4. Bornscheuer UT. Microbial carboxyl esterases: classification, properties and application in biocatalysis. FEMS Microbiol Rev. 2002;26:73–81.
5. Yang S, Qin Z, Duan X, Yan Q, Jiang Z. Structural insights into the substrate specificity of two esterases from the thermophilic Rhizomucor miehei. J Lipid Res. 2015;56:1616–24.
6. Xu XW, Wu YH, Wang CS, Wang XG, Oren A, Wu M. Croceicoccus marinus gen. nov., sp. nov., a yellow-pigmented bacterium from deep-sea sediment, and emended description of the family Erythrobacteraceae. Int J Syst Evol Microbiol. 2009;59:2247–53.
7. Lee K-B, Liu C-T, Anzai Y, Kim H, Aono T, Oyaizu H. The hierarchical system of the 'Alphaproteobacteria': description of Hyphomonadaceae fam. nov., Xanthobacteraceae fam. nov. and Erythrobacteraceae fam. nov. Int J Syst Evol Microbiol. 2005;55:1907–19.
8. Huang Y, Zeng Y, Feng H, Wu Y, Xu X. Croceicoccus naphthovorans sp. nov., a polycyclic aromatic hydrocarbons-degrading and acylhomoserine-lactone-producing bacterium isolated from marine biofilm, and emended description of the genus Croceicoccus. Int J Syst Evol Microbiol. 2015;65:1531–6.
9. Wu YH, Li GY, Jian SL, Cheng H, Huo YY, Wang CS, et al. Croceicoccus pelagius sp. nov. and Croceicoccus mobilis sp. nov., isolated from marine environments. Int J Syst Evol Microbiol. 2016;66:4506–11.
10. Yabuuchi E, Kosako Y. Order IV. Sphingomonadales ord. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey's manual of systematic bacteriology, Second Edition, Volume 2, Part C. New York: Springer; 2005. p. 230–3.
11. Euzéby J. Validation list no. 107. List of new names and new combinations previously effectively, but not validly, published. Int J Syst Evol Microbiol. 2006;56:1–6.
12. Garrity GM, Bell JA, Lilburn T. Class I. Alphaproteobacteria class. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey's manual of systematic bacteriology, Second Edition, Volume 2, Part C. New York: Springer; 2005. p. 1.
13. Garrity GM, Bell JA, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey's manual of systematic bacteriology, Second Edition, Volume 2, Part B. New York: Springer; 2005. p. 1.
14. Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35:3100–8.
15. Lowe TM, Chan PP. tRNAscan-SE on-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 2016;44:W54–7.
16. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. The RAST server: rapid annotations using subsystems technology. BMC Genomics. 2008;9:75.
17. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, et al. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001;29:22–8.
18. Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44:D279–85.
19. Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Methods. 2011;8:785–6.
20. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305:567–80.
21. Grissa I, Vergnaud G, Pourcel C. CRISPRFinder: a web tool to identify clustered regularly interspaced short palindromic repeats. Nucleic Acids Res. 2007;35:W52–7.
22. Kanehisa M, Araki M, Goto S, Hattori M, Hirakawa M, Itoh M, et al. KEGG for linking genomes to life and the environment. Nucleic Acids Res. 2008;36:D480–4.
23. Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: an automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 2007;35:W182–5.
24. Stothard P, Wishart DS. Circular genome visualization and exploration using CGView. Bioinformatics. 2005;21:537–9.
25. Castellani A, Chalmers AJ. Manual of tropical medicine. 3rd ed. New York: Williams Wood and Co; 1919.
26. Skerman VBD, McGowan V, Sneath PHA. Approved lists of bacterial names. Int J Syst Evol Microbiol. 1980;30:225–420.
27. Kanamori T, Kanou N, Atomi H, Imanaka T. Enzymatic characterization of a prokaryotic urea carboxylase. J Bacteriol. 2004;186:2532–9.
28. Moran MA, Belas R, Schell MA, Gonzalez JM, Sun F, Sun S, et al. Ecological genomics of marine Roseobacters. Appl Environ Microbiol. 2007;73:4559–69.
29. Liu Q, Wu YH, Cheng H, Xu L, Wang CS, Xu XW. Complete genome sequence of bacteriochlorophyll-synthesizing bacterium Porphyrobacter neustonensis DSM 9434. Stand Genomic Sci. 2017;12:32.
30. Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26:541–7.
31. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87:4576–9.
32. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–9.

Acknowledgments

This work was supported by grants from the National Natural Science Foundation of China (No. 41406174 and 31770004), the National Key Basic Research Program of China (2014CB441503) and the Natural Science Foundation of Zhejiang Province (LR17D060001).

Author information

Authors and Affiliations

Key Laboratory of Marine Ecosystem and Biogeochemistry, Second Institute of Oceanography, State Oceanic Administration, 36th North BaoChu Road, Hangzhou, 310012, China
Yue-Hong Wu, Hong Cheng, Ying-Yi Huo, Lin Xu, Qian Liu, Chun-Sheng Wang & Xue-Wei Xu

Contributions

XX and CW organized the study. YW and YH performed laboratory experiments. YW, HC and LX analyzed the data. YW drafted the manuscript. XX and QL edited the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xue-Wei Xu.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Wu, YH., Cheng, H., Huo, YY. et al. Complete genome sequence of esterase-producing bacterium Croceicoccus marinus E4A9^T. Stand in Genomic Sci 12, 88 (2017). https://doi.org/10.1186/s40793-017-0300-0

Received: 15 October 2017
Accepted: 05 December 2017
Published: 21 December 2017
DOI: https://doi.org/10.1186/s40793-017-0300-0
