Tutorials

Here are some tutorials on how to process metagenomes using MOCAT.

In general, once MOCAT is set up, you can execute 'runMOCAT.sh', which runs all the commands needed to generate assemblies, gene predictions, gene catalogs, gene catalog annotations, and functional or taxonomic profiles. Alternatively, you can run individual MOCAT.pl commands to process the samples the way you want.

1 - Assemble a metagenome, predict genes, build & annotate a gene catalog and generate functional profiles

This is perhaps one of the most common starting points when analyzing metagenomes. Executing these commands will first pre-process the reads, assemble them, predict genes, cluster them into a reference catalog, annotate the catalog, and finally generate functional profiles.

> MOCAT.pl -sf my.samples -rtf
> MOCAT.pl -sf my.samples -a
> MOCAT.pl -sf my.samples -gp assembly
> MOCAT.pl -sf my.samples -make_gene_catalog -assembly_type assembly
> MOCAT.pl -sf my.samples -annotate_gene_catalog
> MOCAT.pl -sf my.samples -s my.samples.padded -identity 95
> MOCAT.pl -sf my.samples -f my.samples.padded -identity 95
> MOCAT.pl -sf my.samples -p my.samples.padded -identity 95 -mode functional
> MOCAT.pl -sf my.samples -ss
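
The commands above can also be chained in a small wrapper so that a failed step stops the run. The following is a minimal sketch, not part of MOCAT: the function names mocat_steps and run_mocat_steps and the DRY_RUN variable are hypothetical, and it assumes that MOCAT.pl exits with a non-zero status on failure and that the sample file name contains no whitespace. Only the flags already shown in this tutorial are used.

```shell
#!/bin/sh
# Print the tutorial's MOCAT.pl commands, one per line, for a given sample file.
# The padded catalog name is assumed to be "<sample file>.padded", as in the tutorial.
mocat_steps() {
    sf=$1
    padded="$sf.padded"
    cat <<EOF
MOCAT.pl -sf $sf -rtf
MOCAT.pl -sf $sf -a
MOCAT.pl -sf $sf -gp assembly
MOCAT.pl -sf $sf -make_gene_catalog -assembly_type assembly
MOCAT.pl -sf $sf -annotate_gene_catalog
MOCAT.pl -sf $sf -s $padded -identity 95
MOCAT.pl -sf $sf -f $padded -identity 95
MOCAT.pl -sf $sf -p $padded -identity 95 -mode functional
MOCAT.pl -sf $sf -ss
EOF
}

# Run the steps in order and stop at the first one that fails.
# With DRY_RUN=1 the commands are only printed, not executed.
run_mocat_steps() {
    mocat_steps "$1" | while IFS= read -r cmd; do
        echo "+ $cmd"
        if [ "${DRY_RUN:-0}" != 1 ]; then
            # stdin is redirected so the command cannot consume the remaining steps
            $cmd </dev/null || { echo "step failed: $cmd" >&2; exit 1; }
        fi
    done
}
```

After sourcing this file, setting DRY_RUN=1 and calling run_mocat_steps my.samples prints the nine commands without running them; calling it without DRY_RUN executes them in order.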

If the sample file my.samples contains one sample called SAMPLE, the revised assembled sequences are stored in this file:

SAMPLE/assembly.reads.processed.solexaqa.K*/SAMPLE.assembly.reads.processed.solexaqa.K*.scaftig.gz

The predicted genes are stored in these files:

SAMPLE/gene.prediction.assembly.reads.processed.solexaqa.K*.MetaGeneMark.500/SAMPLE.gene.prediction.assembly.reads.processed.solexaqa.K*.MetaGeneMark.f*a

The clustered genes (gene catalog):

GENE_CATALOGS/my.samples/catalog/my.samples.fna

The clustered proteins (gene catalog):

GENE_CATALOGS/my.samples/catalog/my.samples.faa

The padded catalog used for profiling (with sequences down- and upstream of the genes):

GENE_CATALOGS/my.samples/padded_catalog/my.samples.padded

The gene catalog annotations:

GENE_CATALOG_ANNOTATIONS/my.samples/my.samples.functional.map

The gene profiles:

PROFILES/gene.profiles/my.samples/my.samples.*.zip

The functional profiles:

PROFILES/functional.profiles/my.samples/my.samples.*.zip

An Excel summary is stored in:

my.samples.summary.xlsx
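
To see quickly which of the outputs listed above are present after a run, a helper like the following can be used. It is a minimal sketch, not part of MOCAT: the function name check_mocat_outputs is hypothetical, it must be run from the project directory, and it assumes that the directory and file names follow the sample file name and sample name exactly as in the paths above, with no whitespace in either.

```shell
#!/bin/sh
# Report which of the expected output files exist.
# Usage: check_mocat_outputs SAMPLE_FILE SAMPLE_NAME
check_mocat_outputs() {
    sf=$1
    sample=$2
    for pattern in \
        "$sample/assembly.reads.processed.solexaqa.K*/$sample.assembly.reads.processed.solexaqa.K*.scaftig.gz" \
        "$sample/gene.prediction.assembly.reads.processed.solexaqa.K*.MetaGeneMark.500/$sample.gene.prediction.assembly.reads.processed.solexaqa.K*.MetaGeneMark.f*a" \
        "GENE_CATALOGS/$sf/catalog/$sf.fna" \
        "GENE_CATALOGS/$sf/catalog/$sf.faa" \
        "GENE_CATALOGS/$sf/padded_catalog/$sf.padded" \
        "GENE_CATALOG_ANNOTATIONS/$sf/$sf.functional.map" \
        "PROFILES/gene.profiles/$sf/$sf.*.zip" \
        "PROFILES/functional.profiles/$sf/$sf.*.zip" \
        "$sf.summary.xlsx"
    do
        # Unquoted expansion lets the shell expand the glob;
        # a pattern that matches nothing is left as a literal string.
        set -- $pattern
        if [ -e "$1" ]; then
            echo "found    $pattern"
        else
            echo "missing  $pattern"
        fi
    done
}
```

After sourcing this file, calling check_mocat_outputs my.samples SAMPLE prints one line per expected output, marked as found or missing.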

2 - Generate taxonomic profiles (mOTU-LG, specI & NCBI)

MOCAT2 comes with two databases for generating taxonomic profiles: one uses NCBI taxonomic levels plus specI species clusters (described in Mende et al. 2013), and the other uses metagenomic OTU linkage groups (mOTU-LGs; described in Sunagawa et al. 2013). Follow this tutorial to generate such profiles.