An updated genome annotation for the model marine bacterium Ruegeria pomeroyi DSS-3

When the genome of Ruegeria pomeroyi DSS-3 was published in 2004, it represented the first sequence from a heterotrophic marine bacterium. Over the last ten years, the strain has become a valuable model for understanding the cycling of sulfur and carbon in the ocean. To ensure that this genome remains useful, we have updated 69 genes to incorporate functional annotations based on new experimental data, and improved the identification of 120 protein-coding regions based on proteomic and transcriptomic data. We review the progress made in understanding the biology of R. pomeroyi DSS-3 and list the changes made to the genome.


Introduction
Ruegeria pomeroyi DSS-3 is an important model organism in studies of the physiology and ecology of marine bacteria [1]. It is a genetically tractable strain that has been essential for elucidating bacterial roles in the marine sulfur and carbon cycles [2,3] and the biology and genomics of the marine Roseobacter clade [4], a group that makes up 5-20% of bacteria in ocean surface waters [5,6]. Here we update the R. pomeroyi DSS-3 genome with 189 changes collected from the work of several research groups over the last ten years.

Genome project history
The genome of R. pomeroyi DSS-3 was sequenced in 2003 by The Institute for Genomic Research (now the J. Craig Venter Institute) using Sanger sequencing (Table 2). It was assembled using the TIGR Assembler [21] and annotated using Glimmer 2 [20]. The genome was published in 2004 [1].

Genome properties
The R. pomeroyi DSS-3 genome contains a 4,109,437 bp circular chromosome (5 bp shorter than previously reported [1]) and a 491,611 bp circular megaplasmid, with a G + C content of 64.1% (Table 3). A detailed description of the genome can be found in the original article [1].
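As an illustration of how the summary figures above relate to the underlying sequence, the following minimal Python sketch totals the two replicon lengths reported here and shows one common way to compute G + C content from a nucleotide string. The function name `gc_fraction` and the toy sequence are hypothetical and are not taken from the genome record. Ambiguous bases are excluded from the denominator, which is an assumption and may differ from the convention used for Table 3.

```python
# Replicon lengths as reported in the text above.
CHROMOSOME_BP = 4_109_437
MEGAPLASMID_BP = 491_611
TOTAL_BP = CHROMOSOME_BP + MEGAPLASMID_BP  # combined genome size in bp


def gc_fraction(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C.

    Hypothetical helper for illustration only; ambiguous symbols such as N
    are ignored (an assumption, not necessarily the convention of Table 3).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / acgt


if __name__ == "__main__":
    print(f"Total genome size: {TOTAL_BP:,} bp")
    # Toy sequence, not real R. pomeroyi data.
    print(f"G + C of toy sequence: {gc_fraction('GGCCAT'):.1%}")
```

Applied to the full chromosome and megaplasmid sequences, a calculation of this kind would be expected to give a value close to the 64.1% reported above, depending on how ambiguous bases are handled.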
Studies of the R. pomeroyi DSS-3 genome have also provided a better understanding of the genes involved in processing organic nitrogen compounds, such as taurine and N-acetyltaurine [24,31,32]. The bacterium can catabolize lysine by using the saccharopine pathway, which is used by many plants and animals, or by using the lysine dehydrogenase pathway. Under high salt conditions, it preferentially uses the latter pathway, leading to biosynthesis of the osmolyte aminoadipate. The function of several genes in both lysine pathways has recently been experimentally verified [37].
Progress has been made in understanding the mechanisms of metal uptake in R. pomeroyi DSS-3. The manganese uptake regulator mur has been experimentally validated, as have the ABC transporter genes for manganese metabolism (sitABCD) [41]. In total, 69 annotation changes were made based on new experimental data identifying genes responsible for carbon, nitrogen, sulfur, and metal uptake and metabolism [42].
Proteomics [42] and mRNA sequencing have resulted in 120 protein-coding regions being identified, removed, or corrected in the updated genome. A detailed proteomic study of R. pomeroyi DSS-3 under diverse growth conditions resulted in the identification of 26 novel open reading frames (ORFs) and 5 sequencing errors [42]. The function of most of the new genes is not known, and 16 of the expressed polypeptides do not have known homologs. Although the 26 ORFs missed in the original annotation are a notable number, they amount to less than the 1% error rate predicted for Glimmer 2 [20]. The proteomic analysis was also able to correct the start sites of 64 genes [42], enhancing the information that had been obtained only from the DNA sequence [20]. Many of the ORFs identified by proteomics were independently confirmed using strand-specific messenger RNA sequences from continuous cultures [43] and the gene-calling software Glimmer 3 [44]. This method also identified several genes that were originally annotated in the wrong orientation, including a novel bacterial collagen gene (SPO1999). A list of genome updates based on these biochemical, genetic, and -omics approaches is provided in Table 4, with full details in Additional file 1: Table S1. The updated annotations have been incorporated into the official genome record at the National Center for Biotechnology Information (Bethesda, MD, USA) under accession numbers CP000031.2 and CP000032.1, and at Roseobase (http://roseobase.org).
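The counts given in this section can be checked with a short calculation. The Python sketch below tallies the two classes of annotation change and, under the assumption that the Glimmer 2 error rate is read as missed ORFs divided by the total number of predicted genes, finds the smallest gene count at which 26 missed ORFs fall strictly below 1%. The function name `min_genes_for_rate` is hypothetical, and the total gene count of the genome is deliberately not assumed here; it should be taken from Table 3.

```python
import math
from fractions import Fraction

# Counts reported in the text.
EXPERIMENTAL_UPDATES = 69     # functional annotations from new experimental data
CODING_REGION_UPDATES = 120   # protein-coding regions identified, removed, or corrected
TOTAL_UPDATES = EXPERIMENTAL_UPDATES + CODING_REGION_UPDATES

MISSED_ORFS = 26                       # novel ORFs found by proteomics
GLIMMER2_ERROR_RATE = Fraction(1, 100)  # 1% rate cited for Glimmer 2


def min_genes_for_rate(missed: int, rate: Fraction) -> int:
    """Smallest total gene count for which missed / total < rate.

    Hypothetical helper; exact rational arithmetic avoids floating-point
    rounding at the threshold.
    """
    return math.floor(Fraction(missed) / rate) + 1


if __name__ == "__main__":
    print(f"Total annotation changes: {TOTAL_UPDATES}")
    threshold = min_genes_for_rate(MISSED_ORFS, GLIMMER2_ERROR_RATE)
    print(f"26 missed ORFs are below 1% for any genome with >= {threshold} genes")
```

Under this reading, the comparison with the 1% rate holds for any annotation containing more than 2,600 predicted genes.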

Conclusion
Ten years after the publication of the Ruegeria pomeroyi DSS-3 genome sequence, advances in knowledge of gene function and structural genome features motivated an annotation update. As an ecologically relevant heterotrophic marine bacterium that is amenable to laboratory studies and genetic manipulation, R. pomeroyi serves as a valuable model organism for investigations of the ecology, biochemistry, and biogeochemistry of ocean microbes.