group leader: Dr. Westermann
MetaMap: An atlas of metatranscriptomic reads in human disease-related RNA-seq data
Background: With the advent of big data in bioinformatics, large data volumes and high-performance computing power enable researchers to re-analyze publicly available datasets at an unprecedented scale. A growing number of studies implicate the microbiome in both normal human physiology and a wide range of diseases. RNA sequencing technology (RNA-seq) is commonly used to infer global eukaryotic gene expression patterns under defined conditions, including human disease-related contexts; however, its generic nature also enables the detection of microbial and viral transcripts.
Findings: We developed a bioinformatic pipeline to screen existing human RNA-seq datasets for the presence of microbial and viral reads by re-inspecting the non-human-mapping read fraction. We validated this approach by recapitulating outcomes from six independent, controlled infection experiments of cell line models and compared them with an alternative metatranscriptomic mapping strategy. We then applied the pipeline to close to 150 terabytes of publicly available raw RNA-seq data from more than 17,000 samples from more than 400 studies relevant to human disease using state-of-the-art high-performance computing systems. The resulting data from this large-scale re-analysis are made available in the presented MetaMap resource.
Conclusions: Our results demonstrate that common human RNA-seq data, including those archived in public repositories, might contain valuable information to correlate microbial and viral detection patterns with diverse diseases. The presented MetaMap database thus provides a rich resource for hypothesis generation toward the role of the microbiome in human disease. Additionally, code to process new datasets and perform statistical analyses is made available.
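The screening idea in the MetaMap abstract, re-inspecting the reads that do not map to the human genome, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the host alignment is available as SAM-format text, relies only on the SAM convention that FLAG bit 0x4 marks an unmapped read, and uses a hypothetical helper name (`unmapped_reads`) and made-up toy records. In practice the retained reads would then be passed to a microbial or viral classification step, which is not shown here.

```python
from typing import Iterable, Iterator, Tuple

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads(sam_lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, sequence) for reads that did not map to the host.

    Hypothetical helper for illustration; not part of the MetaMap code.
    """
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with "@"
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & UNMAPPED_FLAG:
            yield name, seq


# Toy alignment records, invented for this example (not real data)
example_sam = [
    "@SQ\tSN:chr1\tLN:248956422",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGACCGA\tIIIIIIII",
    "read3\t16\tchr1\t500\t60\t8M\t*\t0\t0\tGGCATTCA\tIIIIIIII",
]

# Reads left over after host mapping: candidates for microbial/viral screening
candidates = list(unmapped_reads(example_sam))
```

Only `read2` carries the unmapped flag in this toy input, so it is the single read kept for downstream screening.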
CRP-cAMP mediates silencing of Salmonella virulence at the post-transcriptional level
Invasion of epithelial cells by Salmonella enterica requires expression of genes located in the pathogenicity island I (SPI-1). The expression of SPI-1 genes is very tightly regulated and activated only under specific conditions. Most studies have focused on the regulatory pathways that induce SPI-1 expression. Here, we describe a new regulatory circuit in which CRP-cAMP, a widely established metabolic regulator, silences SPI-1 genes under non-permissive conditions. In CRP-cAMP-deficient strains, we detected a strong upregulation of SPI-1 genes in the mid-logarithmic growth phase. Genetic analyses revealed that CRP-cAMP modulates the level of HilD, the master regulator of Salmonella invasion. This regulation occurs at the post-transcriptional level and requires the presence of a newly identified regulatory motif within the hilD 3'UTR. We further demonstrate that in Salmonella the Hfq-dependent sRNA Spot 42 is transcriptionally repressed by CRP-cAMP and that, when this repression is relieved, Spot 42 exerts a positive effect on hilD expression. In vivo and in vitro assays indicate that Spot 42 targets the 3'UTR of the hilD transcript through its unstructured region III. Together, our results highlight the biological relevance of the hilD 3'UTR as a hub for post-transcriptional control of Salmonella invasion gene expression.