Draft genome sequence of marine-derived Streptomyces sp. TP-A0598, a producer of anti-MRSA antibiotic lydicamycins

Streptomyces sp. TP-A0598, isolated from seawater, produces lydicamycin, a structurally unique type I polyketide bearing two nitrogen-containing five-membered rings, and four congeners, TPU-0037-A, -B, -C, and -D. We herein report the 8 Mb draft genome sequence of this strain, together with the classification and features of the organism and the generation, annotation and analysis of the genome sequence. The genome encodes 7,240 putative ORFs, of which 4,450 were assigned to COG categories. In addition, 66 tRNA genes and one rRNA operon were identified. The genome contains eight gene clusters involved in the production of polyketides and nonribosomal peptides. Among them, a PKS/NRPS gene cluster was assigned to lydicamycin biosynthesis, and a plausible biosynthetic pathway was proposed on the basis of gene function prediction. These genome sequence data will facilitate probing the potential of secondary metabolism in marine-derived Streptomyces.


Introduction
Members of the genus Streptomyces, Gram-positive filamentous actinomycetes, are an attractive source of bioactive secondary metabolites. Terrestrial surface soil is the most common habitat for Streptomyces, but a recent survey has revealed its ubiquitous distribution in marine environments. Marine Streptomyces are currently attracting much attention as an untapped resource of novel bioactive compounds useful for drug development [1][2][3]. In our screening for new anti-MRSA antibiotics, Streptomyces sp. TP-A0598, collected from deep sea water, was found to produce lydicamycin and four new congeners of polyketide origin (Fig. 1) [4]. Lydicamycin is characterized by an unprecedented pyrrolidine ring modified by an aminoiminomethyl group. This ring is linked to a polyketide-derived carbon chain with multiple hydroxyl and olefinic functionalities, and the other end of the chain is linked to an octalin modified by a tetramic acid. Despite this unique structural feature, the biosynthetic genes of lydicamycin have not been reported to date. In this study, we conducted whole genome shotgun sequencing of strain TP-A0598 to identify the PKS gene cluster for lydicamycin. We herein present the draft genome sequence of Streptomyces sp. TP-A0598, together with a description of genome properties and annotation of secondary metabolite genes. The putative lydicamycin biosynthetic gene cluster and a plausible biosynthetic pathway are also reported.

Classification and features
In the course of screening for new bioactive molecules produced by marine microorganisms, Streptomyces sp. TP-A0598 was isolated by a membrane filter method from a seawater sample collected 2,600 meters offshore at a depth of 321 meters at Namerikawa, Toyama, Japan, and was found to produce lydicamycin and its novel congeners. This strain grew well on Bennett's, ISP 3, ISP 4, ISP 5 and Yeast starch agars. On ISP 5, ISP 6 and ISP 7 agars, the growth was poor. The color of aerial mycelia was grayish olive and that of the reverse side was pale yellow on ISP 3 agar. Diffusible pigments were not formed on any agar media that we examined. Strain TP-A0598 formed spiral spore chains, and the spores were cylindrical, 0.5 × 0.9 μm in size, with a warty surface [4]. A scanning electron micrograph of this strain is shown in Fig. 2. Growth occurred at 15-37 °C (optimum 30 °C) and pH 5-9 (optimum pH 7). Strain TP-A0598 exhibited growth with 0-7 % (w/v) NaCl (optimum 0 % NaCl). Strain TP-A0598 utilized D-glucose, sucrose, inositol, L-rhamnose, D-mannitol, D-raffinose, D-fructose, L-arabinose, and D-xylose for growth (Table 1) [4]. This strain was deposited in the NBRC culture collection under the registration number NBRC 110027. The genes encoding 16S rRNA were amplified by PCR using two universal primers, 9F and 1541R. After purification of the PCR product by AMPure (Beckman Coulter), sequencing was carried out according to an established method [5]. A homology search of the sequence was performed with EzTaxon-e [6] [...] sequence together with phylogenetic neighbors that showed over 98.5 % similarity (Fig. 3) using ClustalX2 [8] and NJplot [9]. The phylogenetic analysis confirmed that strain TP-A0598 belongs to the genus Streptomyces.
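The neighbor-selection step described above (keeping phylogenetic neighbors with over 98.5 % similarity) can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and computes percent identity over gap-free columns only. The function names and the toy sequences are hypothetical and are not taken from the study; EzTaxon-e and ClustalX2 compute similarity in their own ways.

```python
def pairwise_similarity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip any column that contains a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def select_neighbors(query: str, candidates: dict, threshold: float = 98.5) -> dict:
    """Return the candidates whose similarity to the query exceeds the threshold."""
    selected = {}
    for name, seq in candidates.items():
        similarity = pairwise_similarity(query, seq)
        if similarity > threshold:
            selected[name] = similarity
    return selected


# Toy aligned sequences (hypothetical, 200 columns each).
query = "ACGT" * 50
candidates = {
    "close_neighbor": "T" + query[1:],          # one mismatch
    "gapped_neighbor": "-" + query[1:],         # one gap column, otherwise identical
    "distant_strain": query[:-20] + "A" * 20,   # fifteen mismatches
}

neighbors = select_neighbors(query, candidates)
for name, similarity in neighbors.items():
    print(f"{name}: {similarity:.2f} %")
```

Only the sequences passing the threshold would then be carried forward into the multiple alignment and tree building.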

Genome project history
In a collaboration between Toyama Prefectural University and NBRC, the organism was selected for genome sequencing to elucidate the lydicamycin biosynthetic gene cluster. We successfully completed the genome project of Streptomyces sp. TP-A0598 as reported in this paper. The draft genome sequence data have been deposited in the INSDC database under the accession numbers BBNO01000001-BBNO01000020. The project information and its association with MIGS version 2.0 compliance are summarized in Table 2 [10].

Growth conditions and genomic DNA preparation
A Streptomyces sp. TP-A0598 monoisolate was grown on a polycarbonate membrane filter (Advantec) on double-diluted ISP 2 agar medium (0.2 % yeast extract, 0.5 % malt extract, 0.2 % glucose, 2 % agar, pH 7.3) at 28 °C. High-quality genomic DNA for sequencing was isolated from the mycelia with an EZ1 DNA Tissue Kit and a BioRobot EZ1 (Qiagen) according to the protocol for extraction of nucleic acid from Gram-positive bacteria. To assess quality, the size, purity, and double-stranded DNA concentration of the genomic DNA were measured by pulsed-field gel electrophoresis, the ratio of absorbance at 260 nm and 280 nm, and the Quant-iT PicoGreen dsDNA Assay Kit (Life Technologies), respectively.

Genome sequencing and assembly
Shotgun and paired-end libraries were prepared and sequenced using 454 pyrosequencing technology and HiSeq1000 (Illumina) paired-end technology, respectively (Table 2). The 70 Mb of shotgun sequences and 702 Mb of paired-end sequences were assembled into 20 scaffolds larger than 500 bp using Newbler v2.6, and subsequently finished using GenoFinisher [11].
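As a rough illustration of what these read yields mean for sequencing depth, the sketch below divides each yield by the genome size of 8,319,549 bp reported under Genome properties. It assumes that "Mb" means 10^6 bases and that the quoted yields are rounded, so the resulting fold-coverage values are approximate estimates and not figures stated by the authors.

```python
# Genome size as reported in the Genome properties section (bp).
GENOME_SIZE_BP = 8_319_549

# Read yields from the text; treating 1 Mb as 1,000,000 bases is an assumption.
READ_YIELD_BP = {
    "454 shotgun": 70_000_000,
    "HiSeq1000 paired-end": 702_000_000,
}


def fold_coverage(total_read_bases: int, genome_size_bp: int) -> float:
    """Average depth: total sequenced bases divided by genome length."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_read_bases / genome_size_bp


coverage = {
    library: fold_coverage(bases, GENOME_SIZE_BP)
    for library, bases in READ_YIELD_BP.items()
}
total_coverage = sum(coverage.values())

for library, depth in coverage.items():
    print(f"{library}: ~{depth:.1f}x")
print(f"combined: ~{total_coverage:.1f}x")
```

Under these assumptions the 454 library contributes roughly 8-fold and the Illumina library roughly 84-fold average depth.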

Genome annotation
Coding sequences were predicted by Prodigal [12] and tRNA genes by tRNAscan-SE [13]. The gene functions were annotated using an in-house genome annotation pipeline, and domains related to PKS and NRPS were searched for using the SMART and PFAM domain databases. PKS and NRPS gene clusters and their domain organizations were analyzed manually. A similarity search against the NCBI nr databases was also used for functional prediction of genes in the lydicamycin biosynthetic gene cluster.

Genome properties
The total size of the genome is 8,319,549 bp and the GC content is 71.0 % (Table 3), similar to other genome-sequenced Streptomyces members. Of the total 7,344 genes, 7,240 are protein-coding genes and 75 are RNA genes. The classification of genes into COG functional categories is shown in Table 4. With regard to secondary metabolism, Streptomyces sp. TP-A0598 has two type I PKS, two type II PKS, two NRPS, and two hybrid PKS/NRPS gene clusters, suggesting a high capacity for producing polyketides and nonribosomal peptides.
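The proportions implied by these counts can be checked with a short calculation. The sketch below uses only numbers quoted in the text (the COG-assigned ORF count comes from the abstract); the variable and function names are hypothetical.

```python
# Counts quoted in the text.
REPORTED = {
    "total_genes": 7344,
    "protein_coding": 7240,
    "rna_genes": 75,
    "cog_assigned": 4450,  # ORFs assigned to COG categories (abstract)
}

# PKS/NRPS gene clusters listed in the Genome properties paragraph.
CLUSTERS = {
    "type I PKS": 2,
    "type II PKS": 2,
    "NRPS": 2,
    "hybrid PKS/NRPS": 2,
}


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


coding_pct = percent(REPORTED["protein_coding"], REPORTED["total_genes"])
cog_pct = percent(REPORTED["cog_assigned"], REPORTED["protein_coding"])
cluster_total = sum(CLUSTERS.values())

# Genes not covered by the two categories quoted in the text.
other_genes = (
    REPORTED["total_genes"] - REPORTED["protein_coding"] - REPORTED["rna_genes"]
)

print(f"protein-coding share of all genes: {coding_pct:.1f} %")
print(f"ORFs with a COG assignment: {cog_pct:.1f} %")
print(f"PKS/NRPS clusters in total: {cluster_total}")
print(f"genes in neither quoted category: {other_genes}")
```

The cluster total of eight agrees with the abstract. The protein-coding and RNA gene counts do not sum to the stated total; the remaining genes are not broken down in this section.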

Insights from the genome sequence
The chemical structure of lydicamycin (Fig. 1) suggests that its carbon skeleton is assembled from eleven malonyl-CoA and six methylmalonyl-CoA precursors by a type I PKS pathway. In addition, this pathway should be [...] (Fig. 4) consists of seventeen PKS modules and one NRPS module (Fig. 5b). According to the assembly line rule [14], the predicted structure of the polyketide arising from this PKS/NRPS hybrid gene cluster was in good accordance with the actual structure of lydicamycin (Fig. 5b). As a starter unit for the polyketide assembly, 4-guanidinobutyryl-CoA could be proposed on the basis of the annotation of TPA0598_03_00880, TPA0598_03_00650 and TPA0598_03_00700. These genes were predicted to encode an amine oxidase, an acyl-CoA ligase, and a transacylase, respectively, by comparison with the corresponding genes in the ECO-02301 biosynthetic gene cluster. In the biosynthesis of ECO-02301, 4-aminobutyryl-CoA is supplied from L-arginine by the sequential action of amine oxidase, acyl-CoA ligase, and amidinohydrolase and is transferred to ACP by transacylase (Fig. 5a) [15]. In the lydicamycin cluster, genes for an amine oxidase (TPA0598_03_00880), an acyl-CoA ligase (TPA0598_03_00650), and a transacylase (TPA0598_03_00700) are present in the region surrounding the PKS cluster, but an amidinohydrolase gene responsible for the hydrolysis of the guanidine residue to the primary amine is lacking (Fig. 5a, Table 5). After the 4-guanidinobutyryl starter is loaded onto the ACP of TPA0598_03_00840, the polyketide chain is extended by eight PKSs and a glycine is added to the polyketide terminus by an NRPS module (Fig. 5b), followed by the formation of an octalin and a tetramic acid ring (Fig. 5c). It was not possible to assign a gene responsible for the cyclization of the guanidino precursor into a pyrrolidine ring. A cytochrome P450 (TPA0598_03_00850) would be responsible for the hydroxylation of the octalin carbon at C-8 (Fig. 5c). Production of deoxy- and demethyl-congeners suggests that substrate recognition by the AT domain in module 3 (second module of TPA0598_03_00740) and the ER domain in module 11 (first module of TPA0598_03_00780) is likely not strict (Table 6).

Fig. 3 The tree uses sequences aligned by ClustalX2 [8] and was constructed by the neighbor-joining method [27]. All positions containing gaps were eliminated. The building of the tree also involved a bootstrapping process repeated 1000 times to generate a majority consensus tree [28], and only bootstrap values above 50 % are shown at branching points. Kitasatospora setae [29] was used as an outgroup.

Table 3 (fragment) Genes with signal peptides: 653 (8.9 %); genes with transmembrane helices: 1,770 (24.1 %); CRISPR repeats: 5 (-). The total is based on the total number of protein coding genes in the genome.
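The bookkeeping behind the assembly line argument can be sketched as a simple tally of building blocks. The example below is illustrative only. It assumes the usual carbon contributions after decarboxylative condensation (two carbons per malonyl-CoA, three per methylmalonyl-CoA), five carbons for the 4-guanidinobutyryl starter (four butyryl carbons plus the guanidino carbon) and two for glycine, and it ignores any carbon gained or lost in post-assembly tailoring. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildingBlock:
    name: str
    carbons: int       # carbons contributed per unit (assumed values, see lead-in)
    count: int         # number of units incorporated
    is_extender: bool  # True if incorporated by a PKS extension module


# Unit counts follow the text: eleven malonyl-CoA, six methylmalonyl-CoA,
# one 4-guanidinobutyryl starter and one glycine added by the NRPS module.
BLOCKS = [
    BuildingBlock("4-guanidinobutyryl starter", 5, 1, False),
    BuildingBlock("malonyl-CoA", 2, 11, True),
    BuildingBlock("methylmalonyl-CoA", 3, 6, True),
    BuildingBlock("glycine", 2, 1, False),
]

# Each PKS module incorporates one extender unit.
pks_modules = sum(block.count for block in BLOCKS if block.is_extender)

# Total carbon count of the assembled chain before tailoring.
total_carbons = sum(block.carbons * block.count for block in BLOCKS)

print(f"PKS extension modules required: {pks_modules}")
print(f"carbons in the assembled chain: {total_carbons}")
```

The extender tally matches the seventeen PKS modules described above, and the carbon total can be compared against the published molecular formula of lydicamycin as a consistency check on the proposed pathway.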

Conclusions
The 8 Mb draft genome of Streptomyces sp. TP-A0598, a producer of lydicamycins isolated from seawater, has been deposited at GenBank/ENA/DDBJ under accession number BBNO00000000. We successfully identified the PKS/NRPS hybrid cluster for lydicamycin biosynthesis and proposed a plausible biosynthetic pathway. In addition, the genome of strain TP-A0598 contains seven orphan PKS or NRPS gene clusters, but secondary metabolites from these orphan clusters have not yet been isolated. The genome sequence information disclosed in this study will be used to investigate additional new bioactive compounds from this strain and will also serve as a valuable reference for evaluating the metabolic potential of marine-derived Streptomyces.

Abbreviations
Agly: Adenylation domain whose substrate is glycine; ACP: Acyl carrier protein domain; AT: Acyltransferase domain whose substrate is malonyl-CoA; ATm: AT whose substrate is methylmalonyl-CoA