How to cite MOCAT2

If you have used MOCAT2 in your work, please cite:

Kultima, J. R., Coelho, L. P., Forslund, K., Huerta-Cepas, J., Li, S. S., Driessen, M., ... Bork, P. (2016). MOCAT2: a metagenomic assembly, annotation and profiling framework. Bioinformatics. http://doi.org/10.1093/bioinformatics/btw183

Please read the full MOCAT2 (v2.0) article at Bioinformatics or the PDF version.

Please read the full MOCAT (v1.0) article at PLoS ONE or the PDF version.

MOCAT (version 1) citation: Kultima JR, Sunagawa S, Li J, Chen W, Chen H, et al. (2012) MOCAT: A Metagenomics Assembly and Gene Prediction Toolkit. PLoS ONE 7(10): e47656. doi:10.1371/journal.pone.0047656

MOCAT2 is a wrapper for third-party software. We therefore strongly suggest that you also cite the following papers if you use MOCAT2:

Initial read trimming and quality control
Cox MP, Peterson DA, Biggs PJ (2010) SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC bioinformatics 11: 485 doi:10.1186/1471-2105-11-485.
FastX program

Mapping reads
Li R, Yu C, Li Y, Lam T-W, Yiu S-M, et al. (2009) SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics (Oxford, England) 25: 1966–1967 doi:10.1093/bioinformatics/btp336.

Edgar RC (2010) Search and clustering orders of magnitude faster than BLAST. Bioinformatics (Oxford, England) 26: 2460–2461 doi:10.1093/bioinformatics/btq461.

Assembly
Li R, Zhu H, Ruan J, Qian W, Fang X, et al. (2010) De novo assembly of human genomes with massively parallel short read sequencing. Genome research 20: 265–272 doi:10.1101/gr.097261.109.

Assembly revision
Li H, Durbin R (2009) Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England) 25: 1754–1760 doi:10.1093/bioinformatics/btp324.

Gene prediction
Hyatt D, Chen G-L, Locascio PF, Land ML, Larimer FW, et al. (2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC bioinformatics 11: 119 doi:10.1186/1471-2105-11-119.
Zhu W, Lomsadze A, Borodovsky M (2010) Ab initio gene identification in metagenomic sequences. Nucleic acids research 38: 1–15 doi:10.1093/nar/gkq275.

Retrieving marker genes
Sunagawa S., et al. (2013) Metagenomic species profiling using universal phylogenetic marker genes. Nature Methods 10, 1196–1199 doi:10.1038/nmeth.2693

Clustering gene catalogs
Fu L., et al. (2012) CD-HIT: accelerated for clustering the next generation sequencing data. Bioinformatics. doi: 10.1093/bioinformatics/bts565

Annotating gene catalogs
Buchfink, B., et al. (2014) Fast and sensitive protein alignment using DIAMOND. Nat. Methods, 12, 59–60.
Arumugam M., et al. (2010) SmashCommunity: a metagenomic annotation and analysis tool. Bioinformatics. doi: 10.1093/bioinformatics/btq536.

Taxonomic profiles
mOTU-LG: Sunagawa,S. et al. (2013) Metagenomic species profiling using universal phylogenetic marker genes. Nat. Methods, 10, 1196–9
specI & NCBI: Mende,D.R. et al. (2013) Accurate and universal delineation of prokaryotic species. Nat. Methods, 10, 881–4.

Pre-compiled reference gene catalogs
IGC (human gut): Li,J. et al. (2014) An integrated catalog of reference genes in the human gut microbiome. Nat. Biotechnol., 32, 834–41.
CRC-RGC (human gut): Zeller,G. et al. (2014) Potential of fecal microbiota for early-stage detection of colorectal cancer. Mol. Syst. Biol., 10, 766.
skin-RGC (human skin): Oh,J. et al. (2014) Biogeography and individuality shape function in the human skin metagenome. Nature, 514, 59–64.
mouse-RGC (mouse gut): Xiao,L. et al. (2015) A catalog of the mouse gut metagenome. Nat. Biotechnol., 33, 1103–1108.
OM-RGC (ocean): Sunagawa,S. et al. (2015) Structure and function of the global ocean microbiome. Science, 348 (6237), 1:10

Functional profiles
eggNOG: Huerta-Cepas, J., et al. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res 2015. 10.1093/nar/gkv1248.
ARDB: Liu,B. and Pop,M. (2009) ARDB—Antibiotic Resistance Genes Database. Nucleic Acids Res., 37, D443–D447.
CARD: McArthur, A.G., et al. The comprehensive antibiotic resistance database. Antimicrob Agents Chemother 2013;57(7):3348-3357. 10.1128/AAC.00419-13.
DBETH: Chakraborty, A., et al. DBETH: a Database of Bacterial Exotoxins for Human. Nucleic Acids Res 2012;40(Database issue):D615-620. 10.1093/nar/gkr942.
dbCAN: Yin, Y., et al. dbCAN: a web resource for automated carbohydrate-active enzyme annotation. Nucleic Acids Res 2012;40(Web Server issue):W445-451. 10.1093/nar/gks479.
DrugBank: Knox C., et al. DrugBank 3.0: a comprehensive resource for 'omics' research on drugs. Nucleic Acids Res 2011;39(Database issue):D1035-41. 10.1093/nar/gkq1126
ICEberg: Bi D., et al. ICEberg: a web-based resource for integrative and conjugative elements found in Bacteria. Nucleic Acids Res. 2012 Jan;40(Database issue):D621-6. 10.1093/nar/gkr846.
KEGG: Kanehisa, M., et al. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res 2014;42(Database issue):D199-205. 10.1093/nar/gkt1076.
MetaCyc: Caspi, R., et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 2015. 10.1093/nar/gkv1164.
MvirDB: Zhou, C.E., et al. MvirDB--a microbial database of protein toxins, virulence factors and antibiotic resistance genes for bio-defence applications. Nucleic Acids Res 2007;35(Database issue):D391-394. 10.1093/nar/gkl791.
PATRIC: Mao, C., et al. Curation, integration and visualization of bacterial virulence factors in PATRIC. Bioinformatics 2015;31(2):252-258. 10.1093/bioinformatics/btu631.
Pfam: Finn, R.D., et al. Pfam: the protein families database. Nucleic Acids Res 2014;42(Database issue):D222-230. 10.1093/nar/gkt1223.
Prophages: Waller, A.S., et al. Classification and quantification of bacteriophage taxa in human gut metagenomes. ISME J 2014;8(7):1391-1402. 10.1038/ismej.2014.30.
Resfams: Gibson, M.K., et al. Improved annotation of antibiotic resistance determinants reveals microbial resistomes cluster by ecology. ISME J 9(1). 10.1038/ismej.2014.106.
SEED subsystems: Overbeek, R., et al. The SEED and the Rapid Annotation of microbial genomes using Subsystems Technology (RAST). Nucleic Acids Res 2014;42(Database issue):D206-214. 10.1093/nar/gkt1226.
Superfamily: Gough, J., et al. Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. J Mol Biol 2001;313(4):903-919. 10.1006/jmbi.2001.5080.
vFam: Skewes-Cox, P., et al. Profile hidden Markov models for the detection of viruses within metagenomic sequence data. PLoS One 2014;9(8):e105067. 10.1371/journal.pone.0105067.
VFDB: Chen, L., et al. VFDB 2012 update: toward the genetic diversity and molecular evolution of bacterial virulence factors. Nucleic Acids Res 2012;40(Database issue):D641-645. 10.1093/nar/gkr989.
Victors: Mao, C., et al. Curation, integration and visualization of bacterial virulence factors in PATRIC. Bioinformatics 2015;31(2):252-258. 10.1093/bioinformatics/btu631.