Multi-cohort ensemble learning framework for vaginal microbiome-based endometrial cancer detection

Dollina Dodani & Aline Talhouk · 2025-12-08

Introduction

Endometrial cancer is the most common gynecologic malignancy in high-income countries and lacks an established strategy for early detection. Prior studies suggest that the vaginal microbiome may hold diagnostic potential, but inconsistent findings have limited clinical translation.

Methods

We conducted a systematic review to collect and analyze vaginal 16S rRNA sequencing data from five independent cohorts (n = 265). These studies included women with histologically confirmed endometrial cancer and controls with benign gynecologic conditions. We used these datasets to identify microbial signatures associated with endometrial cancer and to develop a predictive machine learning model.

Results

Microbial diversity was significantly higher in endometrial cancer samples, and host characteristics influenced community composition. Peptoniphilus was reproducibly enriched in cancer samples across cohorts. An ensemble classifier accurately identified endometrial cancer in a held-out test set, achieving an area under the receiver operating characteristic curve of 0.93 (95% CI: 0.71–0.93), sensitivity of 1.0 (95% CI: 0.74–1.0), and a negative predictive value of 1.0 (95% CI: 0.59–1.0).
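
A sensitivity or negative predictive value of 1.0 on a small test set still carries a wide confidence interval, because the lower bound depends on how many samples support the estimate. The sketch below illustrates this with an exact binomial (Clopper-Pearson) interval. It makes several assumptions: the abstract does not report the test-set confusion matrix or the interval method used, so the counts (12 cancers all detected, 7 true negatives with no false negatives) are hypothetical, and the function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for a function f that increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a proportion x / n."""
    # Lower bound: smallest p with P(X >= x | p) = alpha / 2 (0 when x == 0).
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: largest p with P(X <= x | p) = alpha / 2 (1 when x == n).
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Hypothetical test-set counts (assumed for illustration, not from the study).
tp, fn = 12, 0   # cancers detected / missed
tn = 7           # controls correctly called negative

sensitivity = tp / (tp + fn)   # TP / (TP + FN)
npv = tn / (tn + fn)           # TN / (TN + FN)

sens_ci = clopper_pearson(tp, tp + fn)
npv_ci = clopper_pearson(tn, tn + fn)

print(f"sensitivity = {sensitivity:.2f}, 95% CI = ({sens_ci[0]:.2f}, {sens_ci[1]:.2f})")
# sensitivity = 1.00, 95% CI = (0.74, 1.00)
print(f"NPV = {npv:.2f}, 95% CI = ({npv_ci[0]:.2f}, {npv_ci[1]:.2f})")
# NPV = 1.00, 95% CI = (0.59, 1.00)
```

When every one of n samples is a success, the exact lower bound reduces to (alpha/2)^(1/n), so the interval narrows only as the number of test samples grows.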

Discussion

These findings support the potential of vaginal microbiome profiling as a minimally invasive approach for early detection of endometrial cancer.
