MetaMap: An atlas of metatranscriptomic reads in human disease-related RNA-seq data
Authors: Simon, L. M.
Westermann, A. J.
Elbehery, A. H. A.
Theis, F. J.
Abstract:
Background: With the advent of the age of big data in bioinformatics, large volumes of data and high-performance computing power enable researchers to perform re-analyses of publicly available datasets at an unprecedented scale. Ever more studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts; however, its generic nature also enables the detection of microbial and viral transcripts.
Findings: We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from six independent, controlled infection experiments of cell line models and compared them with an alternative metatranscriptomic mapping strategy. We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from more than 17,000 samples from more than 400 studies relevant to human disease using state-of-the-art high-performance computing systems. The resulting data from this large-scale re-analysis are made available in the presented MetaMap resource.
Conclusions: Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for hypothesis generation toward the role of the microbiome in human disease. Additionally, code to process new datasets and perform statistical analyses is made available.
Citation: Gigascience. 2018 Jun 1;7(6). pii: 5036539. doi: 10.1093/gigascience/giy070.
Affiliation: HIRI, Helmholtz-Institut für RNA-basierte Infektionsforschung, Josef-Schneider-Strasse 2, 97080 Würzburg, Germany.
Publisher: Oxford University Press
License: Except where otherwise noted, this item's license is described as Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.
- Functional Profiling of Unfamiliar Microbial Communities Using a Validated De Novo Assembly Metatranscriptome Pipeline.
- Authors: Davids M, Hugenholtz F, Martins dos Santos V, Smidt H, Kleerebezem M, Schaap PJ
- Issue date: 2016
- RNA CoMPASS: a dual approach for pathogen and host transcriptome analysis of RNA-seq datasets.
- Authors: Xu G, Strong MJ, Lacey MR, Baribault C, Flemington EK, Taylor CM
- Issue date: 2014
- SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.
- Authors: Johnson BK, Scholz MB, Teal TK, Abramovitch RB
- Issue date: 2016 Feb 4
- QuickRNASeq lifts large-scale RNA-seq data analyses to the next level of automation and interactive visualization.
- Authors: Zhao S, Xi L, Quan J, Xi H, Zhang Y, von Schack D, Vincent M, Zhang B
- Issue date: 2016 Jan 8
- ViraPipe: scalable parallel pipeline for viral metagenome analysis from next generation sequencing reads.
- Authors: Maarala AI, Bzhalava Z, Dillner J, Heljanko K, Bzhalava D
- Issue date: 2018 Mar 15