Journal
CNRein: an evolution-aware deep reinforcement learning algorithm for single-cell DNA copy number calling
Abstract Low-pass single-cell DNA sequencing technologies and algorithmic advancements have enabled haplotype-specific copy number calling on thousands of cells within tumors. However, measurement uncertainty may result in spurious CNAs inconsistent with realistic evolutionary constraints. We introduce evolution-aware copy number calling via deep reinforcement learning (CNRein). Our simulations demonstrate CNRein infers more accurate copy-number profiles and better recapitulates ground truth clonal structure than existing methods. On sequencing data of breast and ovarian cancer, CNRein produces more parsimonious solutions than existing methods while maintaining agreement with single-nucleotide variants. Additionally, CNRein shows consistency on a breast cancer patient sequenced with distinct low-pass technologies.
Performance of computational algorithms to deconvolve heterogeneous bulk ovarian tumor tissue depends on experimental factors
Abstract Background Single-cell gene expression profiling provides unique opportunities to understand tumor heterogeneity and the tumor microenvironment. Because of cost and feasibility, profiling bulk tumors remains the primary population-scale analytical strategy. Many algorithms can deconvolve these tumors using single-cell profiles to infer their composition. While experimental choices do not change the true underlying composition of the tumor, they can affect the measurements produced by the assay. Results We generated a dataset of high-grade serous ovarian tumors with paired expression profiles obtained using multiple strategies to examine the extent to which experimental factors impact the results of downstream tumor deconvolution methods. We find that pooling samples for single-cell sequencing and subsequent demultiplexing has a minimal effect. We identify dissociation-induced differences that affect cell composition, leading to changes that may compromise the assumptions underlying some deconvolution algorithms. We also observe differences across mRNA enrichment methods that introduce additional discrepancies between the two data types. Finally, we find that experimental factors change cell composition estimates and that the impact differs by method. Conclusions Previous benchmarks of deconvolution methods have largely ignored experimental factors. We find that methods vary in their robustness to experimental factors. We provide recommendations for methods developers seeking to produce the next generation of deconvolution approaches and for scientists designing experiments using deconvolution to study tumor heterogeneity.
iMOKA: k-mer based software to analyze large collections of sequencing data
Abstract iMOKA (interactive multi-objective k-mer analysis) is software that enables comprehensive analysis of sequencing data from large cohorts to generate robust classification models or explore specific genetic elements associated with disease etiology. iMOKA uses a fast and accurate feature reduction step that combines a Naïve Bayes classifier augmented by an adaptive entropy filter and a graph-based filter to rapidly reduce the search space. By using a flexible file format and distributed indexing, iMOKA can easily integrate data from multiple experiments, reduce disk space requirements, and identify changes in transcript levels and single nucleotide variants. iMOKA is available at https://github.com/RitchieLabIGH/iMOKA and Zenodo 10.5281/zenodo.4008947.
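For readers unfamiliar with k-mer features, the following minimal Python sketch shows how overlapping k-mers can be counted per sample and arranged into a k-mer by sample table, the kind of input a feature reduction step would operate on. It is an illustration only and not iMOKA's implementation: the function names `count_kmers` and `kmer_matrix` are hypothetical, and the Naïve Bayes, entropy, and graph-based filters described in the abstract are not reproduced here.

```python
from collections import Counter
from typing import Dict


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence shorter than k yields an empty range, hence an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_matrix(samples: Dict[str, str], k: int) -> Dict[str, Dict[str, int]]:
    """Map each observed k-mer to its count in every sample (0 if absent)."""
    counts = {name: count_kmers(seq, k) for name, seq in samples.items()}
    all_kmers = sorted(set().union(*counts.values()))
    return {
        kmer: {name: counts[name][kmer] for name in samples}
        for kmer in all_kmers
    }


if __name__ == "__main__":
    # Hypothetical toy sequences, not real sequencing data.
    toy = {"s1": "AAAC", "s2": "AACC"}
    for kmer, row in kmer_matrix(toy, 3).items():
        print(kmer, row)
```

In practice, sequencing cohorts contain far too many distinct k-mers for an in-memory dictionary like this, which is why tools in this space rely on compact file formats and indexing.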
Ubiquitination degradation of GATA4 by CUL4B promotes ovarian cancer metastasis by inducing lysosomal acidification
G-quadruplex structures regulate long-range transcriptional reprogramming to promote drug resistance in ovarian cancer cells
Abstract Background Epigenetic evolution is a common mechanism used by cancer cells to evade the therapeutic effects of drug treatment. In ovarian cancers, epigenetically driven resistance is thought to be responsible for many late-stage patient deaths. DNA secondary structures called G-quadruplexes (G4s) are emerging as potential epigenetic marks of relevance to cancer evolution, but their prevalence and distribution in ovarian cancer models have not previously been investigated. Results Here, we describe the first investigation of the role of G4s in the epigenetic regulation of drug-resistant ovarian cancer cells. Through genome-wide mapping of G4s in paired drug-sensitive and drug-resistant cell lines, we find that increased G4 accumulation is associated with enhanced transcription of signalling pathways previously established to promote drug-resistant states, including genes involved in the epithelial to mesenchymal transition and WNT signalling. In contrast to previous studies, the expression-enhancing effects of G4s are found not at gene promoters but in intergenic and intronic regions, indicating that G4s can promote long-range transcriptional regulation in drug-resistant cells. Furthermore, we discover that clusters of G4s (super-G4s) are associated with particularly high levels of transcriptional enhancement that surpass the effects of super-enhancers, which act as well-established regulatory sites in many cancers. Finally, we demonstrate that targeting G4s with small molecules results in significant downregulation of pathways associated with drug resistance, resulting in resensitization of resistant cells to chemotherapy agents. Conclusions These findings indicate that G4 structures are critical for the epigenetic regulatory networks of drug-resistant cells and represent a promising target to treat drug-tolerant ovarian cancer.
Hybrid BAG-seq: DNA and RNA from the same single nucleus reveals interactions between genomic and transcriptomic landscapes in human tumor samples
We introduce hybrid BAG-seq: a high-throughput, multi-omic method that simultaneously captures DNA and RNA from single nuclei. We apply this protocol to 65,499 single nuclei from samples of five uterine cancer patients and validate the clustering using RNA-only and DNA-only protocols from the same tissues. Multiple tumor genome or expression clusters are often present within a patient, with different tumor clones projecting into distinct or shared expression states, demonstrating nearly all possible genome-transcriptome correlations. We also identify mutant stroma with significant X chromosome loss in various cell types and patient-specific stromal subtypes exhibiting aberrant expression patterns.
SPARK-X: non-parametric modeling enables scalable and robust detection of spatial expression patterns for large spatial transcriptomic studies
Abstract Spatial transcriptomic studies are becoming increasingly common and large, posing important statistical and computational challenges for many analytic tasks. Here, we present SPARK-X, a non-parametric method for rapid and effective detection of spatially expressed genes in large spatial transcriptomic studies. SPARK-X not only produces effective type I error control and high power but also brings orders of magnitude computational savings. We apply SPARK-X to analyze three large datasets, one of which is only analyzable by SPARK-X. In these data, SPARK-X identifies many spatially expressed genes including those that are spatially expressed within the same cell type, revealing new biological insights.
Endometrial tumorigenesis involves epigenetic plasticity demarcating non-coding somatic mutations and 3D-genome alterations
The incidence and mortality of endometrial cancer (EC) are on the rise. Eighty-five percent of ECs depend on estrogen receptor alpha (ERα) for proliferation, but little is known about its transcriptional regulation in these tumors. We generate epigenomics, transcriptomics, and Hi-C datastreams in healthy and tumor endometrial tissues, identifying robust ERα reprogramming and profound alterations in 3D genome organization that lead to a gain of tumor-specific enhancer activity during EC development. Integration with endometrial cancer risk single-nucleotide polymorphisms and whole-genome sequencing data from primary tumors and metastatic samples reveals a striking enrichment of risk variants and non-coding somatic mutations at tumor-enriched ERα sites. Through machine learning-based predictions and interaction proteomics analyses, we identify an enhancer mutation which alters 3D genome conformation, impairing recruitment of the transcriptional repressor EHMT2/G9a/KMT1C, thereby alleviating transcriptional repression of ESR1 in EC. In summary, we identify a complex genomic-epigenomic interplay in EC development and progression, altering 3D genome organization to enhance expression of the critical driver ERα.
Springer Science and Business Media LLC
1474-760X