Investigator

James R. White

Johns Hopkins University

Collaborators (10)
Jamie E. Medina, Jillian Phallen, Kian Behbakht, Leonardo N. Hagmann, Maria Wong, Mattie Goldberg, Moises Zapata, Noushin Niknafs, Rachel Culp-Hill, Robert A. Law
Institutions (3)
Johns Hopkins Univers…, University of Colorad…, Unknown Institution

Papers

Genome-wide repeat landscapes in cancer and cell-free DNA

Genetic changes in repetitive sequences are a hallmark of cancer and other diseases, but characterizing these has been challenging using standard sequencing approaches. We developed a de novo kmer finding approach, called ARTEMIS (Analysis of RepeaT EleMents in dISease), to identify repeat elements from whole-genome sequencing. Using this method, we analyzed 1.2 billion kmers in 2837 tissue and plasma samples from 1975 patients, including those with lung, breast, colorectal, ovarian, liver, gastric, head and neck, bladder, cervical, thyroid, or prostate cancer. We identified tumor-specific changes in these patients in 1280 repeat element types from the LINE, SINE, LTR, transposable element, and human satellite families. These included changes to known repeats and 820 elements that were not previously known to be altered in human cancer. Repeat elements were enriched in regions of driver genes, and their representation was altered by structural changes and epigenetic states. Machine learning analyses of genome-wide repeat landscapes and fragmentation profiles in cfDNA detected patients with early-stage lung or liver cancer in cross-validated and externally validated cohorts. In addition, these repeat landscapes could be used to noninvasively identify the tissue of origin of tumors. These analyses reveal widespread changes in repeat landscapes of human cancers and provide an approach for their detection and characterization that could benefit early detection and disease monitoring of patients with cancer.

Utilizing Serum-Derived Lipidomics with Protein Biomarkers and Machine Learning for Early Detection of Ovarian Cancer in the Symptomatic Population

Ovarian cancer is the fifth leading cause of cancer-related deaths among women. Most patients are diagnosed at late stage (III/IV), resulting in a 5-year survival rate below 30%. This is driven by the presentation of vague abdominal symptoms that confound diagnosis at early stages (I/II) and a shortage of robust biomarkers. We are taking a novel approach for earlier ovarian cancer detection, leveraging lipids as biomarkers. We utilized untargeted ultrahigh pressure liquid chromatography–mass spectrometry to analyze sera from two large, independent cohorts (N = 433 and N = 399) designed to reflect the symptomatic population, including individuals with benign adnexal masses, early- and late-stage ovarian cancer, gastrointestinal disorders, and otherwise healthy women seeking care for symptoms. We identified a significantly altered lipid profile in ovarian cancer and early-stage ovarian cancer specifically across both cohorts compared with controls. We also profiled select protein biomarkers (cancer antigen 125, human epididymis protein 4, β-2 folate receptor α, and mucin 1) and, utilizing machine learning–based modeling, identified a proof-of-concept multiomic model consisting of fewer than 20 top-performing lipid and protein features. This model was trained on cohort 1 and tested on cohort 2, achieving AUCs of 92% (95% confidence interval, 87%–95%) for distinguishing ovarian cancer from controls and 88% (95% confidence interval, 83%–93%) for distinguishing early-stage ovarian cancer from controls. These findings demonstrate the clinical utility and robustness of lipids as proof-of-concept diagnostic biomarkers for early ovarian cancer within the clinically complex symptomatic population, particularly when applied in a multiomic approach.

Significance: Patients with ovarian cancer endure delayed diagnosis and poor outcomes. We profiled lipids in two cohorts and integrated them with proteins in machine learning. This enabled early-stage detection in a complex range of controls.
