• Offspring born to influenza A virus infected pregnant mice have increased susceptibility to viral and bacterial infections in early life.

      Jacobsen, Henning; Walendy-Gnirß, Kerstin; Tekin-Bubenheim, Nilgün; Kouassi, Nancy Mounogou; Ben-Batalla, Isabel; Berenbrok, Nikolaus; Wolff, Martin; Dos Reis, Vinicius Pinho; Zickler, Martin; Scholl, Lucas; et al. (Springer Nature, 2021-08-16)
      Influenza during pregnancy can affect the health of offspring in later life; neurocognitive disorders are among the best-described effects. Here, we investigate whether maternal influenza infection has adverse effects on immune responses in offspring. We establish a two-hit mouse model to study the effect of maternal influenza A virus infection (first hit) on vulnerability of offspring to heterologous infections (second hit) in later life. Offspring born to influenza A virus-infected mothers are stunted in growth and more vulnerable to heterologous infections (influenza B virus and MRSA) than those born to PBS- or poly(I:C)-treated mothers. Enhanced vulnerability to infection in neonates is associated with reduced haematopoietic development and immune responses. In particular, alveolar macrophages of offspring exposed to maternal influenza have reduced capacity to clear second hit pathogens. This impaired pathogen clearance is partially reversed by adoptive transfer of alveolar macrophages from healthy offspring born to uninfected dams. These findings suggest that maternal influenza infection may impair immune ontogeny and increase susceptibility to early life infections of offspring.
    • Integration of metabolomics, genomics, and immune phenotypes reveals the causal roles of metabolites in disease.

      Chu, Xiaojing; Jaeger, Martin; Beumer, Joep; Bakker, Olivier B; Aguirre-Gamboa, Raul; Oosting, Marije; Smeekens, Sanne P; Moorlag, Simone; Mourits, Vera P; Koeken, Valerie A C M; et al. (BMC, 2021-07-06)
      Background: Recent studies highlight the role of metabolites in immune diseases, but it remains unknown how much of this effect is driven by genetic and non-genetic host factors. Result: We systematically investigate circulating metabolites in a cohort of 500 healthy subjects (500FG) in whom immune function and activity are deeply measured and whose genetics are profiled. Our data reveal that several major metabolic pathways, including the alanine/glutamate pathway and the arachidonic acid pathway, have a strong impact on cytokine production in response to ex vivo stimulation. We also examine the genetic regulation of metabolites associated with immune phenotypes through genome-wide association analysis and identify 29 significant loci, including eight novel independent loci. Of these, one locus (rs174584-FADS2) associated with arachidonic acid metabolism is causally associated with Crohn's disease, suggesting it is a potential therapeutic target. Conclusion: This study provides a comprehensive map of the integration between the blood metabolome and immune phenotypes, reveals novel genetic factors that regulate blood metabolite concentrations, and proposes an integrative approach for identifying new disease treatment targets.
    • Understanding and Engineering the Stereoselectivity of Humulene Synthase.

      Schotte, Carsten; Lukat, Peer; Deuschmann, Adrian; Blankenfeldt, Wulf; Cox, Russell J; HZI, Helmholtz-Zentrum für Infektionsforschung GmbH, Inhoffenstr. 7, 38124 Braunschweig, Germany. (Wiley-VCH, 2021-06-28)
      The non-canonical terpene cyclase AsR6 is responsible for the formation of 2E,6E,9E-humulene during the biosynthesis of the tropolone sesquiterpenoid (TS) xenovulene A. The structures of unliganded AsR6 and of AsR6 in complex with an in crystallo cyclized reaction product and thiolodiphosphate reveal a new farnesyl diphosphate binding motif that comprises a unique binuclear Mg2+ cluster and an essential K289 residue that is conserved in all humulene synthases involved in TS formation. Structure-based site-directed mutagenesis of AsR6 and its homologue EupR3 identifies a single residue, L285/M261, that controls the production of either 2E,6E,9E- or 2Z,6E,9E-humulene. A possible mechanism for the observed stereoselectivity was investigated using different isoprenoid precursors, and the results demonstrate that M261 has gatekeeping control over product formation.
    • EpitopeVec: Linear Epitope Prediction Using Deep Protein Sequence Embeddings.

      Bahai, Akash; Asgari, Ehsaneddin; Mofrad, Mohammad R K; Kloetgen, Andreas; McHardy, Alice C; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (Oxford University Press, 2021-06-28)
      Motivation: B-cell epitopes (BCEs) play a pivotal role in the development of peptide vaccines, immuno-diagnostic reagents, and antibody production, and thus in infectious disease prevention and diagnostics in general. Experimental methods used to determine BCEs are costly and time-consuming. Therefore, it is essential to develop computational methods for the rapid identification of BCEs. Although several computational methods have been developed for this task, generalizability is still a major concern: cross-testing of classifiers trained and tested on different datasets has revealed accuracies of 51-53%. Results: We describe a new method called EpitopeVec, which uses a combination of residue properties, modified antigenicity scales, and protein language model-based representations (protein vectors) as features of peptides for linear BCE predictions. Extensive benchmarking of EpitopeVec and other state-of-the-art methods for linear BCE prediction on several large and small datasets, as well as cross-testing, demonstrated an improvement in the performance of EpitopeVec over other methods in terms of accuracy and area under the curve (AUC). As the predictive performance depended on the species origin of the respective antigens (viral, bacterial, eukaryotic), we also trained our method on a large viral dataset to create a dedicated linear viral BCE predictor with improved cross-testing
    • Needs for an Integration of Specific Data Sources and Items - First Insights of a National Survey Within the German Center for Infection Research.

      Jakob, Carolin E M; Stecher, Melanie; Fuhrmann, Sandra; Wingen-Heimann, Sebastian; Heinen, Stephanie; Anton, Gabriele; Behnke, Michael; Behrends, Uta; Boeker, Martin; Castell, Stefanie; et al. (IOS Press, 2021-05-24)
      State-subsidized programs are developing medical data integration centers in Germany. To involve infectious disease (ID) researchers in the process of data sharing, common interests and minimum data requirements were prioritized. In June 2019, we initiated the German Infectious Disease Data Exchange (iDEx) project. We developed and performed an online survey to determine the prioritization of requests for data integration and exchange in ID research. The survey was designed with three sub-surveys, including a ranking of 15 data categories and 184 specific data items and a query of 51 available data collection systems. A total of 84 researchers from 17 fields of ID research participated in the survey (predominant research fields: gastrointestinal infections n=11, healthcare-associated and antibiotic-resistant infections n=10, hepatitis n=10). 48% (40/84) of participants had experience as a medical doctor. The three top-ranked data categories were microbiology and parasitology, experimental data, and medication (53%, 52%, and 47% of maximal points, respectively). The most relevant data items for these categories were bloodstream infections, availability of biomaterial, and medication (88%, 87%, and 94% of maximal points, respectively). The ranking of requests for data integration and exchange is diverse and depends on the chosen measure. However, there is a need to promote discipline-related digitalization and data exchange.
    • Tutorial: assessing metagenomics software with the CAMI benchmarking toolkit.

      Meyer, Fernando; Lesker, Till-Robin; Koslicki, David; Fritz, Adrian; Gurevich, Alexey; Darling, Aaron E; Sczyrba, Alexander; Bremges, Andreas; McHardy, Alice C; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (Nature Research, 2021-03-01)
      Computational methods are key in microbiome research, and obtaining a quantitative and unbiased performance estimate is important for method developers and applied researchers. For meaningful comparisons between methods, to identify best practices and common use cases, and to reduce overhead in benchmarking, it is necessary to have standardized datasets, procedures and metrics for evaluation. In this tutorial, we describe emerging standards in computational meta-omics benchmarking derived and agreed upon by a larger community of researchers. Specifically, we outline recent efforts by the Critical Assessment of Metagenome Interpretation (CAMI) initiative, which supplies method developers and applied researchers with exhaustive quantitative data about software performance in realistic scenarios and organizes community-driven benchmarking challenges. We explain the most relevant evaluation metrics for assessing metagenome assembly, binning and profiling results, and provide step-by-step instructions on how to generate them. The instructions use simulated mouse gut metagenome data released in preparation for the second round of CAMI challenges and showcase the use of a repository of tool results for CAMI datasets. This tutorial will serve as a reference for the community and facilitate informative and reproducible benchmarking in microbiome research.
    • Hepatitis C reference viruses highlight potent antibody responses and diverse viral functional interactions with neutralising antibodies.

      Bankwitz, Dorothea; Bahai, Akash; Labuhn, Maurice; Doepke, Mandy; Ginkel, Corinne; Khera, Tanvi; Todt, Daniel; Ströh, Luisa J; Dold, Leona; Klein, Florian; et al. (BMJ Publishing Group, 2020-12-15)
      Community-acquired pneumonia by primary or superinfections with Streptococcus pneumoniae can lead to acute respiratory distress requiring mechanical ventilation. The pore-forming toxin pneumolysin alters the alveolar-capillary barrier and causes extravasation of protein-rich fluid into the interstitial pulmonary tissue, which impairs gas exchange. Platelets usually prevent endothelial leakage in inflamed pulmonary tissue by sealing inflammation-induced endothelial gaps. We not only confirm that S pneumoniae induces CD62P expression in platelets, but we also show that, in the presence of pneumolysin, CD62P expression is not associated with platelet activation. Pneumolysin induces pores in the platelet membrane, which allow anti-CD62P antibodies to stain the intracellular CD62P without platelet activation. Pneumolysin treatment also results in calcium efflux, increase in light transmission by platelet lysis (not aggregation), loss of platelet thrombus formation in the flow chamber, and loss of pore-sealing capacity of platelets in the Boyden chamber. Specific anti-pneumolysin monoclonal and polyclonal antibodies inhibit these effects of pneumolysin on platelets as do polyvalent human immunoglobulins. In a post hoc analysis of the prospective randomized phase 2 CIGMA trial, we show that administration of a polyvalent immunoglobulin preparation was associated with a nominally higher platelet count and nominally improved survival in patients with severe S pneumoniae-related community-acquired pneumonia. Although, due to the low number of patients, no definitive conclusion can be made, our findings provide a rationale for investigation of pharmacologic immunoglobulin preparations to target pneumolysin by polyvalent immunoglobulin preparations in severe community-acquired pneumococcal pneumonia, to counteract the risk of these patients becoming ventilation dependent. This trial was registered at www.clinicaltrials.gov as #NCT01420744.
    • Longitudinal Multi-omics Analyses Identify Responses of Megakaryocytes, Erythroid Cells, and Plasmablasts as Hallmarks of Severe COVID-19.

      Bernardes, Joana P; Mishra, Neha; Tran, Florian; Bahmer, Thomas; Best, Lena; Blase, Johanna I; Bordoni, Dora; Franzenburg, Jeanette; Geisen, Ulf; Josephs-Spaulding, Jonathan; et al. (Elsevier (Cell Press), 2020-11-26)
      Temporal resolution of cellular features associated with a severe COVID-19 disease trajectory is needed for understanding skewed immune responses and defining predictors of outcome. Here, we performed a longitudinal multi-omics study using a two-center cohort of 14 patients. We analyzed the bulk transcriptome, bulk DNA methylome, and single-cell transcriptome (>358,000 cells, including BCR profiles) of peripheral blood samples harvested from up to 5 time points. Validation was performed in two independent cohorts of COVID-19 patients. Severe COVID-19 was characterized by an increase of proliferating, metabolically hyperactive plasmablasts. Coinciding with critical illness, we also identified an expansion of interferon-activated circulating megakaryocytes and increased erythropoiesis with features of hypoxic signaling. Megakaryocyte- and erythroid-cell-derived co-expression modules were predictive of fatal disease outcome. The study demonstrates broad cellular effects of SARS-CoV-2 infection beyond adaptive immune cells and provides an entry point toward developing biomarkers and targeted treatments of patients with COVID-19.
    • Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research.

      Hufsky, Franziska; Lamkiewicz, Kevin; Almeida, Alexandre; Aouacheria, Abdel; Arighi, Cecilia; Bateman, Alex; Baumbach, Jan; Beerenwinkel, Niko; Brandt, Christian; Cacciabue, Marco; et al. (Oxford Academic, 2020-11-04)
      SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories.
    • Severe COVID-19 Is Marked by a Dysregulated Myeloid Cell Compartment.

      Schulte-Schrepping, Jonas; Reusch, Nico; Paclik, Daniela; Baßler, Kevin; Schlickeiser, Stephan; Zhang, Bowen; Krämer, Benjamin; Krammer, Tobias; Brumhard, Sophia; Bonaguro, Lorenzo; et al. (Elsevier/Cell Press, 2020-08-05)
      Coronavirus disease 2019 (COVID-19) is a mild to moderate respiratory tract infection; however, a subset of patients progress to severe disease and respiratory failure. The mechanism of protective immunity in mild forms and the pathogenesis of severe COVID-19 associated with increased neutrophil counts and dysregulated immune responses remain unclear. In a dual-center, two-cohort study, we combined single-cell RNA-sequencing and single-cell proteomics of whole-blood and peripheral-blood mononuclear cells to determine changes in immune cell composition and activation in mild versus severe COVID-19 (242 samples from 109 individuals) over time. HLA-DRhiCD11chi inflammatory monocytes with an interferon-stimulated gene signature were elevated in mild COVID-19. Severe COVID-19 was marked by occurrence of neutrophil precursors, as evidence of emergency myelopoiesis, dysfunctional mature neutrophils, and HLA-DRlo monocytes. Our study provides detailed insights into the systemic immune response to SARS-CoV-2 infection and reveals profound alterations in the myeloid cell compartment associated with severe COVID-19.
    • Evolutionary Stabilization of Cooperative Toxin Production through a Bacterium-Plasmid-Phage Interplay.

      Spriewald, Stefanie; Stadler, Eva; Hense, Burkhard A; Münch, Philipp C; McHardy, Alice C; Weiss, Anna S; Obeng, Nancy; Müller, Johannes; Stecher, Bärbel; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (ASM, 2020-07-21)
      Colicins are toxins produced and released by Enterobacteriaceae to kill competitors in the gut. While group A colicins employ a division of labor strategy to liberate the toxin into the environment via colicin-specific lysis, group B colicin systems lack cognate lysis genes. In Salmonella enterica serovar Typhimurium (S. Tm), the group B colicin Ib (ColIb) is released by temperate phage-mediated bacteriolysis. Phage-mediated ColIb release promotes S. Tm fitness against competing Escherichia coli. It remained unclear how prophage-mediated lysis is realized in a clonal population of ColIb producers and if prophages contribute to evolutionary stability of toxin release in S. Tm. Here, we show that prophage-mediated lysis occurs in an S. Tm subpopulation only, thereby introducing phenotypic heterogeneity to the system. We established a mathematical model to study the dynamic interplay of S. Tm, ColIb, and a temperate phage in the presence of a competing species. Using this model, we studied long-term evolution of phage lysis rates in a fluctuating infection scenario. This revealed that phage lysis evolves as a bet-hedging strategy that maximizes phage spread, regardless of whether colicin is present or not. We conclude that the ColIb system, lacking its own lysis gene, is making use of the evolutionarily stable phage strategy to be released. Prophage lysis genes are highly prevalent in nontyphoidal Salmonella genomes. This suggests that the release of ColIb by temperate phages is widespread. In conclusion, our findings shed new light on the evolution and ecology of group B colicin systems.
      IMPORTANCE: Bacteria are excellent model organisms to study mechanisms of social evolution. The production of public goods, e.g., toxin release by cell lysis in clonal bacterial populations, is a frequently studied example of cooperative behavior. Here, we analyze evolutionary stabilization of toxin release by the enteric pathogen Salmonella. The release of colicin Ib (ColIb), which is used by Salmonella to gain an edge against competing microbiota following infection, is coupled to bacterial lysis mediated by temperate phages. Here, we show that phage-dependent lysis and subsequent release of colicin and phage particles occurs only in part of the ColIb-expressing Salmonella population. This phenotypic heterogeneity in lysis, which represents an essential step in the temperate phage life cycle, has evolved as a bet-hedging strategy under fluctuating environments such as the gastrointestinal tract. Our findings suggest that prophages can thereby evolutionarily stabilize costly toxin release in bacterial populations.
    • Evaluating assembly and variant calling software for strain-resolved analysis of large DNA viruses.

      Deng, Zhi-Luo; Dhingra, Akshay; Fritz, Adrian; Götting, Jasper; Münch, Philipp C; Steinbrück, Lars; Schulz, Thomas F; Ganzenmüller, Tina; McHardy, Alice C; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (Oxford University Press (OUP), 2020-07-07)
      Infection with human cytomegalovirus (HCMV) can cause severe complications in immunocompromised individuals and congenitally infected children. Characterizing heterogeneous viral populations and their evolution by high-throughput sequencing of clinical specimens requires the accurate assembly of individual strains or sequence variants and suitable variant calling methods. However, the performance of most methods has not been assessed for populations composed of low divergent viral strains with large genomes, such as HCMV. In an extensive benchmarking study, we evaluated 15 assemblers and 6 variant callers on 10 lab-generated benchmark data sets created with two different library preparation protocols, to identify best practices and challenges for analyzing such data. Most assemblers, especially metaSPAdes and IVA, performed well across a range of metrics in recovering abundant strains. However, only one, Savage, recovered low abundant strains and in a highly fragmented manner. Two variant callers, LoFreq and VarScan2, excelled across all strain abundances. Both shared a large fraction of false positive variant calls, which were strongly enriched in T to G changes in a ‘G.G’ context. The magnitude of this context-dependent systematic error is linked to the experimental protocol. We provide all benchmarking data, results and the entire benchmarking workflow named QuasiModo, Quasispecies Metric determination on omics, under the GNU General Public License v3.0 (https://github.com/hzi-bifo/Quasimodo), to enable full reproducibility and further benchmarking on these and other data.
    • YBX1 Indirectly Targets Heterochromatin-Repressed Inflammatory Response-Related Apoptosis Genes through Regulating CBX5 mRNA.

      Kloetgen, Andreas; Duggimpudi, Sujitha; Schuschel, Konstantin; Hezaveh, Kebria; Picard, Daniel; Schaal, Heiner; Remke, Marc; Klusmann, Jan-Henning; Borkhardt, Arndt; McHardy, Alice C; et al. (MDPI, 2020-06-23)
      Medulloblastomas arise from undifferentiated precursor cells in the cerebellum and account for about 20% of all solid brain tumors during childhood; standard therapies include radiation and chemotherapy, which oftentimes come with severe impairment of the cognitive development of the young patients. Here, we show that the posttranscriptional regulator Y-box binding protein 1 (YBX1), a DNA- and RNA-binding protein, acts as an oncogene in medulloblastomas by regulating cellular survival and apoptosis. We observed different cellular responses upon YBX1 knockdown in several medulloblastoma cell lines, with significantly altered transcription and subsequent apoptosis rates. Mechanistically, PAR-CLIP for YBX1 and integration with RNA-Seq data uncovered direct posttranscriptional control of the heterochromatin-associated gene CBX5; upon YBX1 knockdown and subsequent CBX5 mRNA instability, heterochromatin-regulated genes involved in inflammatory response, apoptosis and death receptor signaling were de-repressed. Thus, YBX1 acts as an oncogene in medulloblastoma through indirect transcriptional regulation of inflammatory genes regulating apoptosis and represents a promising novel therapeutic target in this tumor entity.
    • Functional omics analyses reveal only minor effects of microRNAs on human somatic stem cell differentiation.

      Schira-Heinen, Jessica; Czapla, Agathe; Hendricks, Marion; Kloetgen, Andreas; Wruck, Wasco; Adjaye, James; Kögler, Gesine; Werner Müller, Hans; Stühler, Kai; Trompeter, Hans-Ingo; et al. (NPG, 2020-02-24)
      The contribution of microRNA-mediated posttranscriptional regulation on the final proteome in differentiating cells remains elusive. Here, we evaluated the impact of microRNAs (miRNAs) on the proteome of human umbilical cord blood-derived unrestricted somatic stem cells (USSC) during retinoic acid (RA) differentiation by a systemic approach using next generation sequencing analysing mRNA and miRNA expression and quantitative mass spectrometry-based proteome analyses. Interestingly, regulation of mRNAs and their dedicated proteins highly correlated during RA-incubation. Additionally, RA-induced USSC demonstrated a clear separation from native USSC thereby shifting from a proliferating to a metabolic phenotype. Bioinformatic integration of up- and downregulated miRNAs and proteins initially implied a strong impact of the miRNome on the XXL-USSC proteome. However, quantitative proteome analysis of the miRNA contribution on the final proteome after ectopic overexpression of downregulated miR-27a-5p and miR-221-5p or inhibition of upregulated miR-34a-5p, respectively, followed by RA-induction revealed only minor proportions of differentially abundant proteins. In addition, only small overlaps of these regulated proteins with inversely abundant proteins in non-transfected RA-treated USSC were observed. Hence, mRNA transcription rather than miRNA-mediated regulation is the driving force for protein regulation upon RA-incubation, strongly suggesting that miRNAs are fine-tuning regulators rather than active primary switches during RA-induction of USSC.
    • Eleven grand challenges in single-cell data science.

      Lähnemann, David; Köster, Johannes; Szczurek, Ewa; McCarthy, Davis J; Hicks, Stephanie C; Robinson, Mark D; Vallejos, Catalina A; Campbell, Kieran R; Beerenwinkel, Niko; Mahfouz, Ahmed; et al. (BMC, 2020-02-07)
      The recent boom in microfluidics and combinatorial indexing strategies, combined with low sequencing costs, has empowered single-cell sequencing technology. Thousands, or even millions, of cells analyzed in a single experiment amount to a data revolution in single-cell biology and pose unique data science problems. Here, we outline eleven challenges that will be central to bringing this emerging field of single-cell data science forward. For each challenge, we highlight motivating research questions, review prior work, and formulate open problems. This compendium is for established researchers, newcomers, and students alike, highlighting interesting and rewarding problems for the coming years.
    • Phylogeographic reconstruction using air transportation data and its application to the 2009 H1N1 influenza A pandemic.

      Reimering, Susanne; Muñoz, Sebastian; McHardy, Alice C; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (PLOS, 2020-02-01)
      Influenza A viruses cause seasonal epidemics and occasional pandemics in the human population. While the worldwide circulation of seasonal influenza is at least partly understood, the exact migration patterns between countries, states or cities are not well studied. Here, we use the Sankoff algorithm for parsimonious phylogeographic reconstruction together with effective distances based on a worldwide air transportation network. By first simulating geographic spread and then phylogenetic trees and genetic sequences, we confirmed that reconstructions with effective distances inferred phylogeographic spread more accurately than reconstructions with geographic distances and Bayesian reconstructions with BEAST that do not use any distance information, and led to results comparable to the Bayesian reconstruction using distance information via a generalized linear model. Our method extends Bayesian methods that estimate rates from the data by using fine-grained locations like airports and inferring intermediate locations not observed among sampled isolates. When applied to sequence data of the pandemic H1N1 influenza A virus in 2009, our approach correctly inferred the origin and proposed airports mainly involved in the spread of the virus. In the case of a novel outbreak, this approach allows rapid analysis of sequence data and inference of origin and spread routes to improve disease surveillance and control.
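      The Sankoff algorithm named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the tree, the node names, the three locations, and the cost matrix are hypothetical, and the made-up costs only stand in for the effective distances that the paper derives from air transportation data.

```python
from math import inf

def sankoff(tree, root, leaf_states, states, cost):
    """Weighted parsimony: return (minimum total cost, {node: location})."""
    table = {}  # table[node][s] = minimum cost of the subtree if node is at s

    def up(node):
        children = tree.get(node, [])
        if not children:
            # A sampled isolate: only its observed location is allowed.
            table[node] = {s: (0.0 if s == leaf_states[node] else inf)
                           for s in states}
            return
        for child in children:
            up(child)
        table[node] = {
            s: sum(min(cost[s][t] + table[child][t] for t in states)
                   for child in children)
            for s in states
        }

    def down(node, state, assignment):
        # Traceback: give each child the location that realises the minimum.
        assignment[node] = state
        for child in tree.get(node, []):
            best = min(states, key=lambda t: cost[state][t] + table[child][t])
            down(child, best, assignment)

    up(root)
    root_state = min(states, key=lambda s: table[root][s])
    assignment = {}
    down(root, root_state, assignment)
    return table[root][root_state], assignment

# Hypothetical example: three locations and made-up pairwise costs.
states = ["A", "B", "C"]
cost = {
    "A": {"A": 0.0, "B": 1.0, "C": 3.0},
    "B": {"A": 1.0, "B": 0.0, "C": 1.0},
    "C": {"A": 3.0, "B": 1.0, "C": 0.0},
}
tree = {"root": ["n1", "l3"], "n1": ["l1", "l2"]}  # internal node -> children
leaf_states = {"l1": "A", "l2": "B", "l3": "C"}

total, assignment = sankoff(tree, "root", leaf_states, states, cost)
print(total, assignment)
```

      Because the internal nodes may take any location in the cost matrix, the traceback can place an ancestor at a location that is not observed among the sampled leaves, which corresponds to the inference of intermediate locations mentioned in the abstract.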
    • CAMITAX: Taxon labels for microbial genomes.

      Bremges, Andreas; Fritz, Adrian; McHardy, Alice C; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (Oxford Academic, 2020-01-01)
      BACKGROUND: The number of microbial genome sequences is increasing exponentially, especially thanks to recent advances in recovering complete or near-complete genomes from metagenomes and single cells. Assigning reliable taxon labels to genomes is key and often a prerequisite for downstream analyses. FINDINGS: We introduce CAMITAX, a scalable and reproducible workflow for the taxonomic labelling of microbial genomes recovered from isolates, single cells, and metagenomes. CAMITAX combines genome distance-, 16S ribosomal RNA gene-, and gene homology-based taxonomic assignments with phylogenetic placement. It uses Nextflow to orchestrate reference databases and software containers and thus combines ease of installation and use with computational reproducibility. We evaluated the method on several hundred metagenome-assembled genomes with high-quality taxonomic annotations from the TARA Oceans project, and we show that the ensemble classification method in CAMITAX improved on all individual methods across tested ranks. CONCLUSIONS: While we initially developed CAMITAX to aid the Critical Assessment of Metagenome Interpretation (CAMI) initiative, it evolved into a comprehensive software package to reliably assign taxon labels to microbial genomes. CAMITAX is available under Apache License 2.0 at https://github.com/CAMI-challenge/CAMITAX.
    • The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

      Zhou, Naihui; Jiang, Yuxiang; Bergquist, Timothy R; Lee, Alexandra J; Kacsoh, Balint Z; Crocker, Alex W; Lewis, Kimberley A; Georghiou, George; Nguyen, Huy N; Hamid, Md Nafiz; et al. (BMC, 2019-11-19)
      BACKGROUND: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. RESULTS: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa genomes, which provided us with genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster, which we suspected of being involved in long-term memory. CONCLUSION: We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.
    • Pediatric ALL relapses after allo-SCT show high individuality, clonal dynamics, selective pressure, and druggable targets.

      Hoell, Jessica I; Ginzel, Sebastian; Kuhlen, Michaela; Kloetgen, Andreas; Gombert, Michael; Fischer, Ute; Hein, Daniel; Demir, Salih; Stanulla, Martin; Schrappe, Martin; et al. (American Society of Hematology, 2019-10-22)
      Survival of patients with pediatric acute lymphoblastic leukemia (ALL) after allogeneic hematopoietic stem cell transplantation (allo-SCT) is mainly compromised by leukemia relapse, carrying dismal prognosis. As novel individualized therapeutic approaches are urgently needed, we performed whole-exome sequencing of leukemic blasts of 10 children with post-allo-SCT relapses with the aim of thoroughly characterizing the mutational landscape and identifying druggable mutations. We found that post-allo-SCT ALL relapses display highly diverse and mostly patient-individual genetic lesions. Moreover, mutational cluster analysis showed substantial clonal dynamics during leukemia progression from initial diagnosis to relapse after allo-SCT. Only very few alterations stayed constant over time. This dynamic clonality was exemplified by the detection of thiopurine resistance-mediating mutations in the nucleotidase NT5C2 in 3 patients' first relapses, which disappeared in the post-allo-SCT relapses on relief of selective pressure of maintenance chemotherapy. Moreover, we identified TP53 mutations in 4 of 10 patients after allo-SCT, reflecting acquired chemoresistance associated with selective pressure of prior antineoplastic treatment. Finally, in 9 of 10 children's post-allo-SCT relapse, we found alterations in genes for which targeted therapies with novel agents are readily available. We could show efficient targeting of leukemic blasts by APR-246 in 2 patients carrying TP53 mutations. Our findings shed light on the genetic basis of post-allo-SCT relapse and may pave the way for unraveling novel therapeutic strategies in this challenging situation.
    • Structures and functions linked to genome-wide adaptation of human influenza A viruses.

      Klingen, Thorsten R; Loers, Jens; Stanelle-Bertram, Stephanie; Gabriel, Gülsah; McHardy, Alice C; BRICS, Braunschweiger Zentrum für Systembiologie, Rebenring 56, 38106 Braunschweig, Germany. (Springer Nature, 2019-04-18)
      Human influenza A viruses elicit short-term respiratory infections with considerable mortality and morbidity. While H3N2 viruses have circulated for more than 50 years, the recent introduction of pH1N1 viruses presents an excellent opportunity for a comparative analysis of the genome-wide evolutionary forces acting on both subtypes. Here, we inferred patches of sites relevant for adaptation, i.e. being under positive selection, on eleven viral protein structures, from all available data since 1968 and correlated these with known functional properties. Overall, pH1N1 viruses have more patches than H3N2 viruses, especially in the viral polymerase complex, while antigenic evolution is more apparent for H3N2 viruses. In both subtypes, NS1 has the highest patch and patch site frequency, indicating that NS1-mediated viral attenuation of host inflammatory responses is a continuously intensifying process, elevated even in the longtime-circulating subtype H3N2. We confirmed the resistance-causing effects of two pH1N1 changes against oseltamivir in NA activity assays, demonstrating the value of the resource for discovering functionally relevant changes. Our results represent an atlas of protein regions and sites with links to host adaptation, antiviral drug resistance and immune evasion for both subtypes for further study.