Utilizing Serum-Derived Lipidomics with Protein Biomarkers and Machine Learning for Early Detection of Ovarian Cancer in the Symptomatic Population

Brendan M. Giles & Abigail McElhinny et al. · 2025

Abstract

Ovarian cancer is the fifth leading cause of cancer-related deaths among women. Most patients are diagnosed at a late stage (III/IV), resulting in a 5-year survival rate below 30%. Late diagnosis is driven by vague abdominal symptoms that confound diagnosis at early stages (I/II) and by a shortage of robust biomarkers. We take a novel approach to earlier ovarian cancer detection by leveraging lipids as biomarkers. We utilized untargeted ultrahigh pressure liquid chromatography–mass spectrometry to analyze sera from two large, independent cohorts (N = 433 and N = 399) designed to reflect the symptomatic population, including individuals with benign adnexal masses, early- and late-stage ovarian cancer, gastrointestinal disorders, and otherwise healthy women seeking care for symptoms. Compared with controls, we identified a significantly altered lipid profile in ovarian cancer overall, and in early-stage ovarian cancer specifically, across both cohorts. We also profiled select protein biomarkers (cancer antigen 125, human epididymis protein 4, β-2 folate receptor α, and mucin 1) and, utilizing machine learning–based modeling, identified a proof-of-concept multiomic model consisting of fewer than 20 top-performing lipid and protein features. This model was trained on cohort 1 and tested on cohort 2, achieving AUCs of 92% (95% confidence interval, 87%–95%) for distinguishing ovarian cancer from controls and 88% (95% confidence interval, 83%–93%) for distinguishing early-stage ovarian cancer from controls. These findings demonstrate the clinical utility and robustness of lipids as proof-of-concept diagnostic biomarkers for early ovarian cancer within the clinically complex symptomatic population, particularly when applied in a multiomic approach.
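
To illustrate how an AUC with a 95% confidence interval can be reported for a model scored on an independent test cohort, the following is a minimal Python sketch. It is not the authors' pipeline: the abstract does not state how the confidence intervals were computed, so the percentile bootstrap used here is an assumption, the data are synthetic, and the function names (`auc`, `bootstrap_auc_ci`) are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample subjects with replacement from the test cohort.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in for held-out model scores: 60 controls (label 0)
# and 60 cases (label 1), with cases shifted toward higher scores.
rng = random.Random(42)
labels = [0] * 60 + [1] * 60
scores = [rng.gauss(0.0, 1.0) for _ in range(60)] + \
         [rng.gauss(1.5, 1.0) for _ in range(60)]

point = auc(labels, scores)
lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.2f} (95% CI, {lo:.2f}-{hi:.2f})")
```

In a setting like the one described, `labels` and `scores` would come from the held-out cohort only, so that the interval reflects performance on data the model did not see during training.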

Significance:

Patients with ovarian cancer endure delayed diagnosis and poor outcomes. We profiled lipids in two cohorts and integrated them with protein biomarkers in machine learning models. This enabled early-stage detection against a complex range of controls.

Authors
Brendan M. Giles, Rachel Culp-Hill, Robert A. Law, Charles M. Nichols, Mattie Goldberg, Enkhtuya Radnaa, Maria Wong, Connor Hansen, Moises Zapata, Collin Hill, Kian Behbakht, Benjamin G. Bitler, Emma J. Crosbie, Chloe E. Barr, Anna Jeter, Vuna S. Fa, Violeta Beleva Guthrie, Leonardo N. Hagmann, Emily C. Kubota, James Robert White, Abigail McElhinny