Multitask Deep Learning for Automated Segmentation and Prognostic Stratification of Endometrial Cancer via Biparametric MRI

Ning Lang · 2025-06-19

ABSTRACT

Background

Endometrial cancer (EC) is a common gynecologic malignancy; accurate assessment of key prognostic factors is important for treatment planning.

Purpose

To develop a deep learning (DL) framework based on biparametric MRI for automated tumor segmentation and multitask classification of key prognostic factors in EC, including grade, stage, histological subtype, lymphovascular space invasion (LVSI), and deep myometrial invasion (DMI).

Study Type

Retrospective.

Subjects

A total of 325 patients with histologically confirmed EC were included: 211 training, 54 validation, and 60 test cases.

Field Strength/Sequence

T2‐weighted imaging (T2WI, FSE/TSE) and diffusion‐weighted imaging (DWI, SS‐EPI) sequences at 1.5 and 3 T.

Assessment

The DL model comprised tumor segmentation and multitask classification. Manual delineation on T2WI and DWI served as the reference standard for segmentation. Separate models were trained using T2WI alone, DWI alone, and combined T2WI + DWI to classify dichotomized key prognostic factors. Performance was assessed in the validation and test cohorts. For DMI, the combined model's performance was compared with visual assessment by four radiologists (with 1, 4, 7, and 20 years' experience), each of whom independently reviewed all cases.

Statistical Tests

Segmentation was evaluated using the Dice similarity coefficient (DSC), Jaccard similarity coefficient (JSC), 95th percentile Hausdorff distance (HD95), and average surface distance (ASD). Classification performance was assessed using the area under the receiver operating characteristic curve (AUC). Model AUCs were compared using DeLong's test. A p value < 0.05 was considered significant.
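As an illustration of the overlap and discrimination metrics named above, the following is a minimal sketch, not the study's implementation. It assumes binary masks flattened to sequences of 0/1 and binary class labels; the function names `dice_jaccard` and `auc_rank` are hypothetical, and the empty-mask convention is an assumption. HD95, ASD, and DeLong's test are not covered here.

```python
from typing import Sequence, Tuple


def dice_jaccard(pred: Sequence[int], ref: Sequence[int]) -> Tuple[float, float]:
    """Return (DSC, JSC) for two flattened binary masks of equal length."""
    if len(pred) != len(ref):
        raise ValueError("masks must contain the same number of voxels")
    inter = sum(1 for p, r in zip(pred, ref) if p and r)
    n_pred = sum(1 for p in pred if p)
    n_ref = sum(1 for r in ref if r)
    union = n_pred + n_ref - inter
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dsc = 2 * inter / (n_pred + n_ref)
    jsc = inter / union
    return dsc, jsc


def auc_rank(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (made-up values, not study data).
pred_mask = [1, 1, 1, 0, 0, 0]
ref_mask = [0, 1, 1, 1, 0, 0]
dsc, jsc = dice_jaccard(pred_mask, ref_mask)
auc = auc_rank([0.9, 0.8, 0.35, 0.3], [1, 0, 1, 0])
print(round(dsc, 3), round(jsc, 3), auc)
```

For a single case the two overlap metrics are linked by DSC = 2·JSC / (1 + JSC), so they rank individual segmentations identically. Cohort means do not follow this identity exactly because the relation is nonlinear.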

Results

In the test cohort, DSCs were 0.80 (T2WI) and 0.78 (DWI) and JSCs were 0.69 for both. HD95 and ASD were 7.02/1.71 mm (T2WI) versus 10.58/2.13 mm (DWI). The classification framework achieved AUCs of 0.78–0.94 (validation) and 0.74–0.94 (test). For DMI, the combined model performed comparably to radiologists (p = 0.07–0.84).

Conclusions

The unified DL framework segmented EC automatically on both T2WI and DWI and classified the five key prognostic factors with test-cohort AUCs of 0.74–0.94. For DMI, the combined model performed comparably to radiologists.

Evidence Level

3.

Technical Efficacy

Stage 3.