Correcting index databases improves metagenomic studies

Guillaume Méric; Ryan R Wick; Stephen C Watts; Kathryn E Holt; Michael Inouye (2019) Correcting index databases improves metagenomic studies. bioRxiv. DOI: 10.1101/712166

Abstract

Assessing the taxonomic composition of metagenomic samples is an important first step in understanding the biology and ecology of microbial communities in complex environments. Despite a wealth of algorithms and tools for metagenomic classification, relatively little effort has been put into the critical task of improving the quality of reference indices to which metagenomic reads are assigned. Here, we inferred the taxonomic composition of 404 publicly available metagenomes from human, marine and soil environments, using custom index databases modified according to two factors: the number of reference genomes used to build the databases, and the monophyletic strictness of species definitions. Index databases built following the NCBI taxonomic system were also compared to others using Genome Taxonomy Database (GTDB) taxonomic redefinitions. We observed a considerable increase in the rate of read classification using modified reference index databases as compared to a default NCBI RefSeq database, with up to a 4.4-, 6.4- and 2.2-fold increase in classified reads per sample for human, marine and soil metagenomes, respectively. Importantly, targeted correction for 70 common human pathogens and bacterial genera in the index database increased their specific detection levels in human metagenomes. We also show the choice of index database can influence downstream diversity and distance estimates for microbiome data. Overall, the study shows a large amount of accessible information in metagenomes remains unexploited using current methods, and that the same data analysed using different index databases could potentially lead to different conclusions. These results have implications for the power and design of individual microbiome studies, and for comparison and meta-analysis of microbiome datasets.
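The quantities named in the abstract (fold increase in classified reads, diversity estimates, and distance estimates) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the two taxon-count profiles, and the read total are all hypothetical and chosen only to show how one sample classified against two index databases can yield different summary statistics. Shannon diversity and Bray-Curtis dissimilarity are used here as common examples of such estimates; the abstract does not state which metrics the study used.

```python
import math

def classified_fraction(profile, total_reads):
    """Fraction of all reads in a sample that received a taxonomic label."""
    return sum(profile.values()) / total_reads

def shannon_index(profile):
    """Shannon diversity (natural log) computed from per-taxon read counts."""
    total = sum(profile.values())
    return -sum((c / total) * math.log(c / total)
                for c in profile.values() if c > 0)

def bray_curtis(profile_a, profile_b):
    """Bray-Curtis dissimilarity between two count profiles."""
    taxa = set(profile_a) | set(profile_b)
    shared = sum(min(profile_a.get(t, 0), profile_b.get(t, 0)) for t in taxa)
    total = sum(profile_a.values()) + sum(profile_b.values())
    return 1 - 2 * shared / total

# Hypothetical data: the same sample classified against two index databases.
total_reads = 10_000
default_profile = {"Escherichia coli": 600, "Bacteroides fragilis": 400}
modified_profile = {"Escherichia coli": 1500, "Bacteroides fragilis": 1500,
                    "Prevotella copri": 1000}

# Fold increase in the classified-read fraction (modified vs. default index).
fold = (classified_fraction(modified_profile, total_reads)
        / classified_fraction(default_profile, total_reads))

# Diversity of each profile, and the distance between the two profiles.
h_default = shannon_index(default_profile)
h_modified = shannon_index(modified_profile)
bc = bray_curtis(default_profile, modified_profile)

print(f"fold increase in classified reads: {fold:.2f}")
print(f"Shannon diversity, default index:  {h_default:.3f}")
print(f"Shannon diversity, modified index: {h_modified:.3f}")
print(f"Bray-Curtis, default vs modified:  {bc:.3f}")
```

In this toy case the modified index both classifies more reads and reports an additional taxon, so the diversity estimate shifts and the two profiles of the same sample are no longer identical, which is the kind of effect the abstract describes.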


Meric_etal_2019_Correcting_index_databases_improves_metagenomic_studies.pdf
Published Version
Available under Creative Commons: NC-ND 3.0
