Metagenome report

Criteria

  • Manuscripts must be submitted in doc, docx, rtf, or the OpenOffice equivalent using the Metagenome report template. Any manuscript not using the template will be returned to authors.
  • Manuscripts that are missing key figures, tables and references (as per the guidelines below), or that include mandatory tables that deviate from the accepted layout will not be accepted for peer review and will be returned to authors for correction.
  • Please use the Genome Report Checklist before submitting your manuscript.

The rationale of the content model is to provide information that is consistently and uniformly presented for rapid and easy consumption by both human and machine readers. First-time authors are advised to familiarize themselves with the appropriate article type(s) and to carefully follow the SIGS style sheet to ensure an expedited review.

Preparing your manuscript

Title

A concise and meaningful title should convey to readers the nature of the article. Metagenome reports should include the name of the study, biosample or sequencing project and, if possible, inform the reader about the reason for sequencing this sample. (Here, metagenome refers to a shotgun metagenome; additional 16S data sets used in the analysis of the shotgun metagenomes can be briefly described.)

Authors

Authors should include each person who contributed to the generation, analysis and interpretation of the data that is being reported. Full author names should be given as: (given name(s), middle initial(s), family name), using proper capitalization. For an example of properly formatted author and institutional information see http://standardsingenomics.biomedcentral.com/articles/10.1186/1944-3277-9-2.

Institutional Affiliation

The institutional affiliation(s) of each author should be identified with a superscripted number in the order of first appearance in the author list. Author affiliations should be numbered, using Vancouver style.

Corresponding author

The author responsible for the submission and coordination of communication with the editorial office during peer review and with readers post publication should be identified along with their institutional email address. The editorial office will deal with only one corresponding author on a given manuscript.

Abstract (Heading 1)

Authors should provide a concise, non-redundant and meaningful abstract that describes the nature of the article. It should summarize the rationale, the objectives and the findings of the report and provide key details (e.g., relevant IMG/M or MG-RAST identifiers, other project metadata that are accessible in a standardized form).

Key words (Heading 1)

Authors should include five to seven descriptive keywords. These may include the article type, the sampling site and other significant details about the nature of the study.

Abbreviations (Heading 1)

Authors should include any non-standard abbreviations that are used throughout the article. Do not include well-known abbreviations (e.g., NCBI, EMBL, DNA, RNA) and do not use non-standard abbreviations for organism names. Species and subspecies names must be fully spelled out on first use as binomials (genus name and species epithet) or trinomials (genus name, species epithet subsp. subspecific epithet). Following first usage, the genus name may be abbreviated by using the first letter of the genus name, followed by a period and the epithets. Authors should not include abbreviations that are only used once or twice in the manuscript. Abbreviations should not be redefined in the body of the article. If abbreviations are used, the article must include an abbreviations section.

Introduction (Heading 1)

Include a brief, high-level description of the study, the source material (biosample) and the rationale for its selection for sequencing. Indicate whether or not the metagenome(s) is/are part of a larger project (i.e., a study).

Site Information (Heading 2)

Describe the sampling site(s) and the rationale for selecting the samples or why the specific sample was chosen for sequencing.

Metagenome sequencing information (Heading 1)

Metagenome project history (Heading 2)

Introductory paragraph – This section of the manuscript should provide a detailed summary of the sequencing and bioinformatics methodology. The section should include an introductory paragraph that provides the readers with specific information about the project: when the project began, when the project was completed, and which public databases contain the project data and other relevant information. These data should be summarized in Table 1.

Sample information (Heading 2)

Describe the location of all the samples used in your study, including collection times and related details, using the MIGS designation [cite MIGS reference].

Sample preparation, DNA extraction, library generation, and sequencing technology (Heading 1)

Describe the sample preparation, nucleic acid extraction, the generation of sequencing libraries, and the sequencing in sufficient detail to allow others to reproduce the reported results. The descriptions below are meant to characterize the sequence sets submitted to the archive; any additional experiments or treatments that were not used in the final submission should not be mentioned.

Sample preparation (collection, transport, and storage) (Heading 2)

Describe protocols used in the collection and preparation of each sample in Table 2.

Example: Soil samples s1 and s2 were collected using 2 cm diameter soil cores sampling down to 10 cm soil depth, whereas samples s3 and s4 were collected at 50 cm soil depth. Soil samples were immediately placed on dry ice after collection, and then stored at -80 °C. Samples were thawed at 4 °C overnight before DNA extraction.

If any samples were prepared differently provide an additional paragraph using the sample labels used in Table 2.

DNA extraction (kits used, protocols used) (Heading 2)

Describe in full the procedures used for sample handling and DNA extraction of each sample in Table 2.

Example: Soil DNA extractions were completed using a Mobio PowerSoil DNA isolation kit with 0.25 g of material. The concentration of the DNA extracts was determined using an Invitrogen PicoGreen assay and a plate reader.

Library generation (kits used, protocols used) (Heading 2)

Describe in full the procedures used for sequencing library generation for each library in Table 3.

Example: DNA was quantified using the Invitrogen Qubit High-Sensitivity Assay, then sheared to the appropriate size range (200-400 bp) using a Covaris Sonicator. After shearing, libraries were generated using an Illumina TruSeq Library Prep kit following the standard protocol. Libraries were then carefully size selected using the Sage Blue Pippin for a tight insert of 180 bp. Insert sizes were verified using the Agilent Bioanalyzer.

Sequencing technology (Heading 2)

Describe in full the sequencing that was performed. Include information about parameters chosen for sequencing, e.g. insert size, paired end vs single end, etc.

Provide a summary of the resulting data, including the number of base pairs generated (in GBp) and the average read length. Example: Sequencing was completed on an Illumina HiSeq 2000. 23 Gb of data was generated from a single 2×100 bp paired-end sequencing lane. Each sample yielded roughly 5.75 Gb of data, as samples were run 4 per lane. Since these were soil metagenomes, we ensured that we had a tight 180 bp insert size so that there would be enough overlap during 2×100 bp sequencing to join the reads.
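The figures in the example above can be cross-checked with simple arithmetic before submission. The following is a minimal sketch, not part of the template: the function names are hypothetical, it assumes the lane yield is split evenly across the multiplexed samples, and it takes the read overlap to be the combined read length minus the insert size.

```python
def per_sample_yield_gb(lane_yield_gb, samples_per_lane):
    """Yield per sample, assuming an even split across the lane."""
    return lane_yield_gb / samples_per_lane


def paired_read_overlap_bp(read_length_bp, insert_size_bp):
    """Bases shared by the two reads of a pair (0 if the reads do not meet)."""
    return max(0, 2 * read_length_bp - insert_size_bp)


# Values taken from the example paragraph: 23 Gb per lane, 4 samples per lane,
# 2x100 bp reads and a 180 bp insert.
print(per_sample_yield_gb(23, 4))        # 5.75
print(paired_read_overlap_bp(100, 180))  # 20
```

With the example values, each read pair overlaps by 20 bp, which is what makes joining the reads possible.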

Sequence processing, annotation, and data analysis (Heading 1)

Briefly describe bioinformatics procedures used in the analysis of the library and metagenomes. If assembly was performed with one or more libraries, please describe the procedure. Include sufficient detail to allow for readers to reproduce the authors’ work.

Example: Libraries L1 and L2 were assembled into metagenome 1 using MetaVelvet with default parameters; library L3 was not assembled and was used as metagenome 2. The metagenomes were processed with IMG/M using the default pipeline and parameters described in [reference for: Markowitz, V.M. et al, Nucl. Acids Res (2014) doi:10.1093/nar/gkt919].

Sequence processing (Heading 2)

Describe in full any quality control, clustering, or assembly performed. Describe each of the libraries sequenced in the library information table (Table 3).

Metagenome processing (Heading 2)

Describe the processing steps used to create the metagenome from one or more libraries, at a level of detail that allows the reader to reproduce those steps.

Metagenome annotation (Heading 2)

Describe in detail the annotation procedures used to extract and describe features.

Post-processing (optional, Heading 2)

Describe any additional tools used for analyzing the data (prior to presenting results in section 4. Metagenome properties).

Metagenome Properties (Heading 1)

This corresponds to the Results section. Describe the properties of the metagenome (e.g., number of contigs, overall size in gigabasepairs, number of features, etc.). Provide a section describing the taxonomic and functional properties extracted. Describe any comparisons with other metagenomes. Also include additional results if suitable.

Include Table 7 describing the properties.

Taxonomic diversity (Heading 2)

Authors are encouraged to describe their metagenomes at the level suitable (e.g., family, order, class, phylum, etc.) for their research and to include information on how the counts were obtained; e.g., whether the values are absolute or normalized (and, if normalized, how). Authors can also use relative abundance. Taxonomic composition data for the metagenome should be shown in either a table or a rank abundance graph. Authors may provide an optional rarefaction graphic if multiple metagenomes are analyzed.

Include a table (Table 8) with absolute or relative abundance values for the taxa observed in the metagenome(s). Authors are encouraged to use NCBI taxonomy terms to describe taxa and to provide the date or version of said taxonomy. Alternative taxonomies can be used. Authors should use the metagenome or library labels as appropriate in Table 8 and in the text. Please add additional columns if required, using the metagenome identifier as the column heading.

Functional diversity (Heading 2)

Provide a table (Table 9) with a functional breakdown of your samples (using supported portal IDs as column headings). Authors are free to choose a namespace (COG, SEED Subsystems, KEGG, etc.) and also free to choose absolute or relative values. Any relative normalized values require a description of the procedure for obtaining them.

Additional Results (optional, Heading 2)

Describe any additional results in detail; any bioinformatic procedures used should be described in the Post-processing section. Please provide no more than four paragraphs and add at most two tables or two figures to support your analysis.

Conclusions (Heading 1)

Describe any conclusions in one paragraph.

Competing interests (Heading 1)

All financial and non-financial competing interests must be declared in this section. See our editorial policies for a full explanation of competing interests. If you are unsure whether you or any of your co-authors have a competing interest please contact the editorial office.

Funding (Heading 1)

All sources of funding for the research reported should be declared. The role of the funding body in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript should be declared.

Authors' contributions (Heading 1)

The individual contributions of authors to the manuscript should be specified in this section. Guidance and criteria for authorship can be found in our editorial policies.

Acknowledgements (optional, Heading 1)

Please acknowledge anyone who contributed towards the article who does not meet the criteria for authorship including anyone who provided professional writing services or materials.

Authors should obtain permission to acknowledge from all those mentioned in the Acknowledgements section.

See our editorial policies for a full explanation of acknowledgements and authorship criteria.

Group authorship: if you would like the names of the individual members of a collaboration group to be searchable through their individual PubMed records, please ensure that the title of the collaboration group is included on the title page and in the submission system and also include collaborating author names as the last paragraph of the “Acknowledgements” section. Please add authors in the format First Name, Middle initial(s) (optional), Last Name. You can add institution or country information for each author if you wish, but this should be consistent across all authors.

Please note that individual names may not be present in the PubMed record at the time a published article is initially included in PubMed as it takes PubMed additional time to code this information.

Authors' information (optional) (Heading 1)

You may choose to use this section to include any relevant information about the author(s) that may aid the reader's interpretation of the article, and understand the standpoint of the author(s). This may include details about the authors' qualifications, current positions they hold at institutions or societies, or any other relevant background information. Please refer to authors using their initials. Note this section should not be used to describe any competing interests.

Endnotes (optional) (Heading 1)

Endnotes should be designated within the text using a superscript lowercase letter and all notes (along with their corresponding letter) should be included in the Endnotes section. Please format this section in a paragraph rather than a list.

References (Heading 1)

 

All references, including URLs, must be numbered consecutively, in square brackets, in the order in which they are cited in the text, followed by any in tables or legends. The reference numbers must be finalized and the reference list fully formatted before submission.

Examples of the BioMed Central reference style are shown below. Please ensure that the reference style is followed precisely.

See our editorial policies for author guidance on good citation practice.

Web links and URLs: All web links and URLs, including links to the authors' own websites, should be given a reference number and included in the reference list rather than within the text of the manuscript. They should be provided in full, including both the title of the site and the URL, as well as the date the site was accessed, in the following format: The Mouse Tumor Biology Database. http://tumor.informatics.jax.org/mtbwi/index.do. Accessed 20 May 2013. If an author or group of authors can clearly be associated with a web link (e.g. for blogs) they should be included in the reference.

Example reference style:

Article within a journal
Smith JJ. The world of science. Am J Sci. 1999;36:234-5.

Article within a journal (no page numbers)
Rohrmann S, Overvad K, Bueno-de-Mesquita HB, Jakobsen MU, Egeberg R, Tjønneland A, et al. Meat consumption and mortality - results from the European Prospective Investigation into Cancer and Nutrition. BMC Med. 2013;11:63.

Article within a journal by DOI
Slifka MK, Whitton JL. Clinical implications of dysregulated cytokine production. Dig J Mol Med. 2000; doi:10.1007/s801090000086.

Article within a journal supplement
Frumin AM, Nussbaum J, Esposito M. Functional asplenia: demonstration of splenic activity by bone marrow scan. Blood. 1979;59 Suppl 1:26-32.

Book chapter, or an article within a book
Wyllie AH, Kerr JFR, Currie AR. Cell death: the significance of apoptosis. In: Bourne GH, Danielli JF, Jeon KW, editors. International review of cytology. London: Academic; 1980. p. 251-306.

OnlineFirst chapter in a series (without a volume designation but with a DOI)
Saito Y, Hyuga H. Rate equation approaches to amplification of enantiomeric excess and chiral symmetry breaking. Top Curr Chem. 2007. doi:10.1007/128_2006_108.

Complete book, authored
Blenkinsopp A, Paxton P. Symptoms in the pharmacy: a guide to the management of common illness. 3rd ed. Oxford: Blackwell Science; 1998.

Online document
Doe J. Title of subordinate document. In: The dictionary of substances and their effects. Royal Society of Chemistry. 1999. http://www.rsc.org/dose/title of subordinate document. Accessed 15 Jan 1999.

Online database
Healthwise Knowledgebase. US Pharmacopeia, Rockville. 1998. http://www.healthwise.org. Accessed 21 Sept 1998.

Supplementary material/private homepage
Doe J. Title of supplementary material. 2000. http://www.privatehomepage.com. Accessed 22 Feb 2000.

University site
Doe, J: Title of preprint. http://www.uni-heidelberg.de/mydata.html (1999). Accessed 25 Dec 1999.

FTP site
Doe, J: Trivial HTTP, RFC2169. ftp://ftp.isi.edu/in-notes/rfc2169.txt (1999). Accessed 12 Nov 1999.

Organization site
ISSN International Centre: The ISSN register. http://www.issn.org (2006). Accessed 20 Feb 2007.

Dataset with persistent identifier
Zheng L-Y, Guo X-S, He B, Sun L-J, Peng Y, Dong S-S, et al. Genome data from sweet and grain sorghum (Sorghum bicolor). GigaScience Database. 2011. http://dx.doi.org/10.5524/100012.

Preparing tables and figures

Table 1. Study information

Label | Metagenome Label | Comment
MG-RAST ID, MG-Portal ID or IMG/M ID | mgm4447971.3 or 3300005161 | At least one identifier is required
SRA ID or ENA ID |  | SRA or ENA identifier required; this identifies each sequence set
Study | mgp128 or Gs0099864 | If this metagenome is part of a larger project, what is the project name and identifier? (Note: MG-RAST refers to the study as project)
GOLD ID (sequencing project) | Gp0111011 | Required
GOLD ID (analysis project) | Ga0066807 | Required
NCBI BIOPROJECT | - | Required
Relevance | e.g. Human oral health | Required

Additional metagenomes should be represented by an additional column in each of the respective tables. Authors should note that tables should fit within the confines of a printed page (8.5 x 11 inches) with a printable area of 6.67 x 8.97 inches (width x height). Tables that exceed this size must be printed on multiple pages and have repeated headers and footers.

Table 2. Sample information

Label | Sample Label | Comment
GOLD ID (biosample) | Gb0110744 | (optional)
Biome |  | Biomes are defined based on factors such as plant structures, leaf types, plant spacing, and other factors like climate. Biome should be treated as the descriptor of the broad ecological context of a sample. Examples include: desert, taiga, deciduous woodland, or coral reef. EnvO (v 2013-06-14) terms can be found via the link: http://www.ebi.ac.uk/ontology-lookup/termSearch.do?ontologyName=ENVO&includeObsolete=true&termName=bio&termId=
Feature |  | Environmental feature level includes geographic environmental features. Compared to biome, feature is a descriptor of the more local environment. Examples include: harbor, cliff, or lake. EnvO (v 2013-06-14) terms can be found via the link: http://www.ebi.ac.uk/ontology-lookup/termSearch.do?ontologyName=ENVO&includeObsolete=true&termName=bio&termId
Material |  | The environmental material level refers to the material that was displaced by the sample, or material in which a sample was embedded, prior to the sampling event. Environmental material terms are generally mass nouns. Examples include: air, soil, or water. EnvO (v 2013-06-14) terms can be found via the link: http://www.ebi.ac.uk/ontology-lookup/termSearch.do?ontologyName=null&includeObsolete=true&termName=materi&termId=
Latitude and Longitude | 39.481448, 0.353066 | (latitude and longitude) GPS coordinates
Vertical distance | -10 m below sea level | Depth/altitude or elevation (metric units)
Geographic location | Gulf of Mexico, Mexico | (country and/or sea, region)
Collection date and time | 21/05/15, 13:30h (GMT) | Date and time

Table 3. Library information. (Describe the creation of each library)

Label | Library Label | Comment
Sample Label(s) | Sample Label(s) | Sample(s) used for this library.
Sample prep method | Mobio PowerSoil DNA Isolation Kit | How was the DNA isolated from the sample?
Library prep method(s) | Illumina TruSeq DNA | How was the sequencing library created?
Sequencing platform(s) | Illumina HiSeq 2000 | Which sequencing platform was used?
Sequencing chemistry | V3 SBS Kit | Chemistry, version of kit. This typically will have to be obtained from the sequencing provider.
Sequence size (GBp) | 9.5 GBp | How many gigabases were sequenced?
Number of reads | 105,012,415 | How many reads were obtained?
Single-read or paired-end sequencing? | Paired-end | One of single-read, paired-end, or empty.
Sequencing library insert size | 400 | Size of the fragments subjected to end sequencing
Average read length | 150 | 
Standard deviation for read length | 1 | 

Table 4. Sequence processing. (Describe the processing for each library)

Label | Library label | Comment
Tool(s) used for quality control | MG-RAST (default) | Names of the tools used for sequence quality control; add parameters used in parentheses
Number of sequences removed by quality control procedures | 97,722 | Number of sequences removed by QC
Number of sequences that passed quality control procedures | 10,051,251 | Number of sequences after QC
Number of artificial duplicate reads | 48,710 | Number of artificial duplicate reads identified (see doi:10.1038/ismej.2009.72)

Describe the processing steps used to create the metagenome from one or more libraries, at a level of detail that allows the reader to reproduce those steps.

Table 5. Metagenome statistics

Label | Metagenome Label | Comment
Libraries used | Library label(s) | List all the libraries used for this metagenome.
Assembly tool(s) used | Metavelvet (default) | Name of the tool used for assembly; add parameters used in parentheses. Use NA if no assembly was performed.
Number of contigs after assembly | 508,671 | 
Number of singletons after assembly | 8,091,128 | 
Minimal contig length | 300 | Minimal contig length (if filtering for length was used)
Total bases assembled | 121,123,345 | Total base pairs in assembly
Contig N50 | 854 | 
% of sequences assembled | 28% | Fraction of the input data in the assembly.
Measure for % assembled | from assembly output | Method used for calculating % assembled (either from assembly output or via mapping of reads to contigs; in the latter case provide tool and parameters used)
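For authors whose assembler does not report it, the contig N50 in Table 5 can be computed directly from the contig lengths. The sketch below is illustrative only and assumes the common definition of N50 (the length L such that contigs of length L or longer contain at least half of the assembled bases); the function name, the optional minimum-length filter and the example lengths are hypothetical, not part of the template.

```python
def contig_n50(contig_lengths, min_length=0):
    """Return the N50 of the contigs that are at least min_length long."""
    kept = sorted((n for n in contig_lengths if n >= min_length), reverse=True)
    if not kept:
        raise ValueError("no contigs left after length filtering")
    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        if running >= half_total:
            # First contig at which the cumulative sum reaches half the total
            return length


# Hypothetical contig lengths in bp, for illustration only
lengths = [300, 450, 854, 1200, 2000]
print(contig_n50(lengths))                  # 1200
print(contig_n50(lengths, min_length=500))  # 1200
```

If a minimal contig length filter was applied, report the N50 computed on the filtered set so that it matches the other rows of the table.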

 Describe in detail the annotation procedures used to extract and describe features.

Table 6. Annotation parameters

Label | Metagenome Label | Comment
Annotation system |  | Examples are: MG-RAST, IMG/M
Gene calling program |  | Leave blank if using built-in default; otherwise provide name and version and add parameters used in parentheses if applicable
Annotation algorithm |  | Leave blank if using built-in default; otherwise provide name and version and add parameters used in parentheses
Database(s) used | SEED, Silva | List the name of the database(s) used for analysis.

 This table presents the results of the sequence analysis (potential assembly of one or more libraries, etc).

Table 7. Metagenome properties

Label | Metagenome Label | Comment
Number of contigs | 103,124 | The number of contigs (and singletons) included in the analysis
GBp | 3.4 GBp | Size of the data, in gigabasepairs, that you used for analysis.
Number of features identified | 97,000 | Total number of features identified
CDS | 89,700 | Total number of protein coding genes
rRNA | 730 | Total number of ribosomal RNA genes
others | 0 | Number of other features
CDSs with COG | 42,010 | Number of CDSs with COG, Pfam or SEED subsystem annotations. One or more are required.
CDSs with Pfam |  | 
CDSs with SEED subsystem |  | 
Alpha diversity | 480 | The number of species you predict from your sequence data.

Absolute or relative abundance values for the taxa observed in the metagenome(s). Authors are encouraged to use NCBI taxonomy terms to describe taxa and to provide the date or version of said taxonomy. Alternative taxonomies can be used. Authors should use the metagenome or library labels as appropriate in Table 8 and in the text.

Table 8. Taxonomic composition.

Phylum | Metagenome Label 1 | Metagenome Label 2
Acidobacteria | 347 | 124
Actinobacteria | 86,951 | 79,623
Annelida | 2 | 1
Apicomplexa | 94 | 108
Aquificae | 500 | 280
...

Provide a table with a functional breakdown of your samples (using supported portal IDs as column headings). Authors are free to choose a namespace (COG, SEED Subsystems, KEGG, etc.) and also free to choose absolute or relative values. Any relative normalized values require a description of the procedure for obtaining them.

Table 9. Functional diversity

COG Category | Metagenome Label 1 | Metagenome Label 2 | Metagenome Label 3
CELLULAR PROCESSES AND SIGNALING | 6,209 | 19,895 | 23,681
INFORMATION STORAGE AND PROCESSING | 7,038 | 23,467 | 27,370
METABOLISM | 13,173 | 40,780 | 47,663
POORLY CHARACTERIZED | 5,222 | 18,941 | 19,795

Please note: We leave it to the author(s) to determine which level of resolution to choose. Authors should feel free to choose a more specific category of metabolism; e.g., DNA Metabolism.
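Because relative values must be accompanied by a description of how they were obtained, it can help to state the normalization explicitly. The sketch below shows one simple option, dividing each category count by the column total, applied to the example counts in Table 9. It is an illustration under that assumption, not a prescribed procedure; the function and variable names are hypothetical.

```python
def relative_abundance(category_counts):
    """Convert absolute counts to fractions of the column total."""
    total = sum(category_counts.values())
    if total == 0:
        raise ValueError("cannot normalize an all-zero column")
    return {category: n / total for category, n in category_counts.items()}


# Absolute counts copied from the Table 9 example
table9 = {
    "Metagenome Label 1": {
        "CELLULAR PROCESSES AND SIGNALING": 6209,
        "INFORMATION STORAGE AND PROCESSING": 7038,
        "METABOLISM": 13173,
        "POORLY CHARACTERIZED": 5222,
    },
    "Metagenome Label 2": {
        "CELLULAR PROCESSES AND SIGNALING": 19895,
        "INFORMATION STORAGE AND PROCESSING": 23467,
        "METABOLISM": 40780,
        "POORLY CHARACTERIZED": 18941,
    },
    "Metagenome Label 3": {
        "CELLULAR PROCESSES AND SIGNALING": 23681,
        "INFORMATION STORAGE AND PROCESSING": 27370,
        "METABOLISM": 47663,
        "POORLY CHARACTERIZED": 19795,
    },
}

for label, counts in table9.items():
    fractions = relative_abundance(counts)
    print(label)
    for category, fraction in fractions.items():
        print(f"  {category}: {100 * fraction:.1f}%")
```

Normalizing per metagenome makes columns of different sequencing depth comparable, but other schemes are possible; whichever is used should be named in the text.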

Instructions for additional tables

Any additional tables must be formatted in the same manner as tables one through five. This format is detailed below in an example table:

Table Number. Table title.

Table header | - | - | -
Row 1 | - | - | -
Row 2 | - | - | -
Row 3 | - | - | -
Table footer

The table title must not exceed one row, and this row must begin with the table number. The top row of the table must have a top and bottom border, and the bottom row of the table must have a bottom border. 12 pt Times New Roman must be used for both table title and header. Table rows and footer must be 10 pt Times New Roman. The table footer is used to explain different elements. Authors should use superscript a, b, c, etc. to refer to the element which is being described (as in table one). All additional tables should be able to fit in a 10 × 9 space. Font should be no smaller than 10 pt. Additional tables normally only appear in an extended genome report.

Figures and figure legends

The legends should be included in the main manuscript text file at the end of the document, rather than being a part of the figure file. For each figure, the following information should be provided: Figure number (in sequence, using Arabic numerals - i.e. Figure 1, 2, 3 etc.); short title of figure (maximum 15 words); detailed legend, up to 300 words.

Please note that it is the responsibility of the author(s) to obtain permission from the copyright holder to reproduce figures or tables that have previously been published elsewhere.

Figure 1. Rank abundance graph (optional)
Authors can provide an optional figure illustrating the rank abundance analysis. Authors should use the metagenome or library labels as appropriate. Authors can choose the level of resolution for the analysis (e.g., phylum, order, genus, etc.). The example provided shows only one metagenome; authors can include multiple metagenomes using different colors.
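The data behind a rank abundance graph is simply the taxa sorted by decreasing abundance. The sketch below prepares that ordering from the Table 8 example counts for Metagenome Label 1, read here as 347, 86,951, 2, 94 and 500. It is a minimal illustration with hypothetical names; plotting is left to whatever tool the authors prefer.

```python
def rank_abundance(taxon_counts):
    """Return (rank, taxon, relative abundance) tuples, most abundant first."""
    total = sum(taxon_counts.values())
    ordered = sorted(taxon_counts.items(), key=lambda item: item[1], reverse=True)
    return [
        (rank, taxon, count / total)
        for rank, (taxon, count) in enumerate(ordered, start=1)
    ]


# Counts as read from the Table 8 example (Metagenome Label 1)
phylum_counts = {
    "Acidobacteria": 347,
    "Actinobacteria": 86951,
    "Annelida": 2,
    "Apicomplexa": 94,
    "Aquificae": 500,
}

for rank, taxon, fraction in rank_abundance(phylum_counts):
    print(rank, taxon, f"{fraction:.5f}")
```

Plotting rank on the x-axis against relative abundance on a logarithmic y-axis gives the usual rank abundance curve.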

Figure 2. Rarefaction analysis (optional)
Authors can provide an optional figure for rarefaction analysis. Authors should use the metagenome or library labels as appropriate.
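One way to generate the points of a rarefaction curve is the analytic (hypergeometric) expectation of the number of taxa seen in a random subsample of n sequences. The sketch below assumes that approach, which is a standard formulation and not one prescribed by these guidelines; the function name and the taxon counts are hypothetical and purely illustrative. It requires Python 3.8 or later for `math.comb`.

```python
from math import comb


def expected_taxa(taxon_counts, subsample_size):
    """Expected number of taxa observed when drawing subsample_size
    sequences without replacement from the pooled counts."""
    total = sum(taxon_counts)
    if not 0 <= subsample_size <= total:
        raise ValueError("subsample size must be between 0 and the total count")
    all_draws = comb(total, subsample_size)
    # For each taxon: 1 minus the probability that none of its sequences is drawn
    return sum(
        1 - comb(total - count, subsample_size) / all_draws
        for count in taxon_counts
    )


# Hypothetical counts for five taxa (100 sequences in total)
counts = [50, 30, 15, 4, 1]

for n in (0, 1, 10, 50, 100):
    print(n, round(expected_taxa(counts, n), 3))
```

Evaluating the function over a range of subsample sizes for each metagenome or library yields one curve per label.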
