Investigator

Ben Van Calster

Leids Universitair Medisch Centrum, Department of Biomedical Data Sciences

Papers (5)
Diagnostic tests for …; Comparison of the ADN…; Multiclass risk model…; Head-to-head comparis…; Developing risk model…
Collaborators (10)
Dirk Timmerman, Clare Davenport, Gary Stephen Collins, Jan Yvan Jos Verbakel, Jolien Ceusters, Jon Deeks, Katie Scandrett, Lasai Barreñada, Lauren Sturdy, Laure Wynants
Institutions (3)
KU Leuven; The University of Bir…; The University of Bir…

Papers

Diagnostic tests for ovarian cancer in premenopausal women with non-specific symptoms (ROCkeTS): prospective, multicentre, cohort study

Abstract Objective To investigate the accuracy of risk prediction models and scores for diagnosing ovarian cancer in premenopausal women presenting to secondary care with symptoms and abnormal test results. Design Prospective cohort study. Setting Secondary care in 23 hospitals in the UK between June 2015 and March 2023. Participants Premenopausal women presenting with non-specific symptoms, and raised serum levels of cancer antigen 125 or abnormal imaging results, were prospectively recruited, predominantly referred through the NHS urgent suspected cancer pathway from primary care. A head-to-head comparison of the accuracy of the six risk prediction models and scores was conducted using donated blood and ultrasound scans performed by NHS staff trained in the use of International Ovarian Tumour Analysis (IOTA) imaging terminology. The index tests used were Risk of Malignancy Index 1 (with pre-stated thresholds of 200, 250), Risk of Malignancy Algorithm (7.4%, 11.4%, 12.5%, 13.1%), IOTA Assessment of Different Neoplasias in the adnEXa (ADNEX) (3%, 10%), IOTA simple rules risk model (3%, 10%), IOTA simple rules, and cancer antigen 125 (CA 125, 87 IU/mL). Participants were classified as having primary invasive ovarian cancer versus having benign or normal pathology according to the reference standard determined from surgical specimens or biopsies by histology or cytology, if undertaken, or else at 12 month follow-up. After June 2018, because of covid restrictions and concerns about sample size, recruitment was restricted to only women undergoing surgery within three months of presentation to clinic (in whom ovarian cancer was more likely). 
Main outcome measures Diagnostic accuracy at predicting primary invasive ovarian cancer versus benign or normal histology, assessed by analysing the sensitivity, specificity, C index, area under the receiver operating characteristic curve, positive and negative predictive values, and calibration plots in participants with conclusive reference standard results and available index test data. Results 88 of 1211 premenopausal women received diagnoses of primary ovarian cancer: 49 of 857 women in the pre-June 2018 cohort (prevalence of 5.7%) and 39 of 354 women in the post-June 2018 cohort (11.0%). For the diagnosis of primary ovarian cancer (n=799 women, after exclusion of 58 other diagnoses), Risk of Malignancy Index 1 at the 250 threshold had a sensitivity of 42.6% (95% confidence interval (CI) 28.3 to 57.8) and a specificity of 96.5% (95% CI 94.7 to 97.8). Compared with Risk of Malignancy Index 1 at the 250 threshold, CA 125 and all other tests had higher sensitivity (CA 125 at 87 IU/mL threshold: 55.1%, 40.2 to 69.3, P=0.06; Risk of Malignancy Algorithm at 11.4% threshold: 79.2%, 65.0 to 89.5, P<0.001; IOTA ADNEX at 10% threshold: 89.1%, 76.4 to 96.4, P<0.001; IOTA simple rules risk at 10% threshold: 83.0%, 69.2 to 92.4, P<0.001; IOTA simple rules: 75.0%, 56.6 to 88.5, P=0.01) and lower specificity (CA 125 at 87 IU/mL threshold: 89.0%, 86.5 to 91.2, P<0.001; Risk of Malignancy Algorithm at 11.4% threshold: 73.1%, 69.6 to 76.3, P<0.001; IOTA ADNEX at 10% threshold: 75.1%, 71.4 to 78.6, P<0.001; IOTA simple rules risk at 10% threshold: 76.0%, 72.4 to 79.3, P<0.001; IOTA simple rules: 95.2%, 93.0 to 96.9, P=0.06). Results for IOTA simple rules were inconclusive in 120 of 799 participants. Analysis of the complete cohort (n=1211), including the 354 premenopausal women with a higher likelihood of developing ovarian cancer, yielded similar results.
Conclusions Compared with Risk of Malignancy Index 1 at the 250 threshold (the test currently used in NHS secondary care to triage women to tertiary care), most tests improve sensitivity but reduce specificity. Ultrasound triage with the IOTA ADNEX model at 10% in secondary care demonstrated the highest sensitivity gain, with a decline in specificity comparable to that of the other comparator tests. Ultrasound with the IOTA ADNEX model at 10% should be considered the new standard of care test for triaging premenopausal women in secondary care. Implementation should incorporate staff training and quality assurance. Trial registration ISRCTN17160843.
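The prevalence figures in the abstract follow directly from the reported counts, and predictive values depend on prevalence as well as on sensitivity and specificity. The sketch below recomputes the two cohort prevalences and then applies Bayes' rule to the reported IOTA ADNEX (10% threshold) sensitivity and specificity. Pairing those accuracy estimates with the pre-June 2018 prevalence is an assumption made purely for illustration; the resulting predictive values are not figures reported by the study, and the function names are hypothetical.

```python
def prevalence(cases: int, total: int) -> float:
    """Proportion of participants with the target condition."""
    return cases / total


def predictive_values(sensitivity: float, specificity: float, prev: float):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    tp = sensitivity * prev                # true positive fraction
    fp = (1 - specificity) * (1 - prev)    # false positive fraction
    fn = (1 - sensitivity) * prev          # false negative fraction
    tn = specificity * (1 - prev)          # true negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Counts taken from the abstract.
pre = prevalence(49, 857)    # pre-June 2018 cohort
post = prevalence(39, 354)   # post-June 2018 cohort
print(f"prevalence: {pre:.1%} (pre), {post:.1%} (post)")

# Illustrative only: ADNEX at 10% (sensitivity 89.1%, specificity 75.1%)
# combined with the pre-June 2018 prevalence. This pairing is an assumption.
ppv, npv = predictive_values(0.891, 0.751, pre)
print(f"illustrative PPV {ppv:.3f}, NPV {npv:.3f}")
```

At a low prevalence, even a fairly specific test yields a modest positive predictive value, which is one reason the abstract reports sensitivity and specificity alongside predictive values.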

Multiclass risk models for ovarian malignancy: an illustration of prediction uncertainty due to the choice of algorithm

Abstract Background Assessing malignancy risk is important to choose appropriate management of ovarian tumors. We compared six algorithms to estimate the probabilities that an ovarian tumor is benign, borderline malignant, stage I primary invasive, stage II-IV primary invasive, or secondary metastatic. Methods This retrospective cohort study used 5909 patients recruited from 1999 to 2012 for model development, and 3199 patients recruited from 2012 to 2015 for model validation. Patients were recruited at oncology referral or general centers and underwent an ultrasound examination and surgery ≤ 120 days later. We developed models using standard multinomial logistic regression (MLR), Ridge MLR, random forest (RF), XGBoost, neural networks (NN), and support vector machines (SVM). We used nine clinical and ultrasound predictors but developed models with or without CA125. Results Most tumors were benign (3980 in development and 1688 in validation data); secondary metastatic tumors were least common (246 and 172). The c-statistic (AUROC) to discriminate benign from any type of malignant tumor ranged from 0.89 to 0.92 for models with CA125, and from 0.89 to 0.91 for models without. The multiclass c-statistic ranged from 0.41 (SVM) to 0.55 (XGBoost) for models with CA125, and from 0.42 (SVM) to 0.51 (standard MLR) for models without. Multiclass calibration was best for RF and XGBoost. Estimated probabilities for a benign tumor in the same patient often differed by more than 0.2 (20 percentage points) depending on the model. Net Benefit for diagnosing malignancy was similar for algorithms at the commonly used 10% risk threshold, but was slightly higher for RF at higher thresholds. Comparing models, between 3% (XGBoost vs. NN, with CA125) and 30% (NN vs. SVM, without CA125) of patients fell on opposite sides of the 10% threshold. Conclusion Although several models had similarly good performance, individual probability estimates varied substantially.
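Two quantities in this abstract are easy to make concrete: Net Benefit at a risk threshold, and the share of patients whose classification flips between two models at that threshold. The sketch below uses the standard decision-curve definition of Net Benefit, TP/n - (FP/n) * t/(1 - t). The data are synthetic and the function names are hypothetical; nothing here reproduces the study's models or results.

```python
def net_benefit(y_true, risk, threshold):
    """Net Benefit of treating everyone whose estimated risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def threshold_disagreement(risk_a, risk_b, threshold):
    """Fraction of patients placed on opposite sides of the threshold by two models."""
    flips = sum((a >= threshold) != (b >= threshold) for a, b in zip(risk_a, risk_b))
    return flips / len(risk_a)


# Synthetic example: 1 = malignant, 0 = benign (not study data).
y = [1, 1, 0, 0, 0]
model_a = [0.60, 0.08, 0.12, 0.03, 0.02]
model_b = [0.55, 0.15, 0.05, 0.04, 0.09]

print(net_benefit(y, model_a, 0.10))
print(net_benefit(y, model_b, 0.10))
print(threshold_disagreement(model_a, model_b, 0.10))
```

Two models can have similar discrimination overall while still disagreeing on which side of the threshold an individual patient falls, which is the kind of uncertainty the abstract describes.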

Head-to-head comparison of the RMI and ADNEX models to estimate the risk of ovarian malignancy: a systematic review and meta-analysis of external validation studies

Objectives Assessment of Different NEoplasias in the adneXa (ADNEX) and Risk of Malignancy Index (RMI) are models that estimate the risk of malignancy in ovarian masses based on clinical and ultrasound information. The aim is to perform a meta-analysis of studies that compared the performance of the two models in the same patients (‘head-to-head comparison’). Design Systematic review and meta-analysis. Data sources Systematic literature search from publication of ADNEX model (15/10/2014) up to 31/07/2024 in Embase, Web of Science, Scopus, Medline (via PubMed) and EuropePMC. Eligibility criteria for selecting studies We included all studies that externally validated the performance of ADNEX (with or without CA125) and RMI on the same data. Data extraction and synthesis Two independent reviewers extracted data using a standardised extraction sheet. We assessed risk of bias using PROBAST. We performed random effects meta-analysis of the area under the receiver operating characteristic curve (AUC), sensitivity, specificity and clinical utility (net benefit, relative utility and probability of being useful in a hypothetical new centre) at thresholds commonly used clinically (10% risk of malignancy for ADNEX, 200 for RMI). Results We included 11 studies comprising 8271 tumours. Most studies were at high risk of bias. The summary AUC to distinguish benign from malignant tumours in operated patients for ADNEX with CA125 was 0.92 (95% CI 0.90 to 0.94) and for RMI it was 0.85 (0.81 to 0.89). Sensitivity and specificity for ADNEX with CA125 were 0.93 (0.90 to 0.96) and 0.77 (0.71 to 0.81) and for RMI, they were 0.61 (0.56 to 0.67) and 0.92 (0.89 to 0.94). The probability of the test being useful in a hypothetical new centre in operated patients was 96% for ADNEX with CA125 and 15% for RMI at the selected thresholds. Conclusions ADNEX has better discrimination and clinical utility than RMI.
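As a rough illustration of random effects pooling, the sketch below implements the DerSimonian-Laird estimator. The abstract does not state which estimator or transformation was used, so this is a commonly used stand-in rather than the authors' method, and the input effects and variances are hypothetical logit-scale values, not data from the included studies.

```python
import math


def dersimonian_laird(effects, variances):
    """Random effects pooling with the DerSimonian-Laird estimate of tau^2.

    Returns (pooled effect, standard error, tau^2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1.0 / (v + tau2) for v in variances]        # random effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


def expit(z):
    """Inverse logit, used to map a pooled logit back to the 0-1 scale."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical logit-transformed estimates and their variances (not study data).
effects = [2.2, 2.6, 1.9, 2.8]
variances = [0.04, 0.05, 0.03, 0.06]

pooled, se, tau2 = dersimonian_laird(effects, variances)
low, high = pooled - 1.96 * se, pooled + 1.96 * se
print(expit(pooled), expit(low), expit(high), tau2)
```

Pooling on the logit scale and back-transforming keeps the summary and its interval inside the 0 to 1 range; whether the review did so is not stated in the abstract.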

Developing risk models for multicenter data using standard logistic regression produced suboptimal predictions: A simulation study

Abstract Although multicenter data are common, many prediction model studies ignore this during model development. The objective of this study is to evaluate the predictive performance of regression methods for developing clinical risk prediction models using multicenter data, and provide guidelines for practice. We compared the predictive performance of standard logistic regression, generalized estimating equations, random intercept logistic regression, and fixed effects logistic regression. First, we presented a case study on the diagnosis of ovarian cancer. Subsequently, a simulation study investigated the performance of the different models as a function of the amount of clustering, development sample size, distribution of center‐specific intercepts, the presence of a center‐predictor interaction, and the presence of a dependency between center effects and predictors. The results showed that when sample sizes were sufficiently large, conditional models yielded calibrated predictions, whereas marginal models yielded miscalibrated predictions. Small sample sizes led to overfitting and unreliable predictions. This miscalibration was worse with more heavily clustered data. Calibration of random intercept logistic regression was better than that of standard logistic regression even when center‐specific intercepts were not normally distributed, a center‐predictor interaction was present, center effects and predictors were dependent, or when the model was applied in a new center. Therefore, to make reliable predictions in a specific center, we recommend random intercept logistic regression.
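The difference between a center-specific (conditional) prediction and one that ignores the center can be shown at the prediction step alone. The sketch below assumes an already fitted random intercept logistic model with a single predictor; it does not fit anything, and all coefficients, center effects, and names are hypothetical values chosen for illustration.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def conditional_risk(x, intercept, slope, center_effect=0.0):
    """Predicted risk for predictor value x in a center with the given random intercept.

    With center_effect=0.0 this is the prediction for an 'average' center,
    which is one common choice when the model is applied in a new center.
    """
    return expit(intercept + center_effect + slope * x)


# Hypothetical fitted values (assumptions, not estimates from the study).
INTERCEPT, SLOPE = -2.0, 1.0
CENTER_EFFECTS = {"A": 0.8, "B": -0.8}   # estimated random intercepts per center

x = 1.0
risk_a = conditional_risk(x, INTERCEPT, SLOPE, CENTER_EFFECTS["A"])
risk_b = conditional_risk(x, INTERCEPT, SLOPE, CENTER_EFFECTS["B"])
risk_new = conditional_risk(x, INTERCEPT, SLOPE)   # new center, effect set to 0

print(risk_a, risk_new, risk_b)
```

The same patient profile receives a different risk in centers A and B, whereas a model without center effects would return a single value for both; this is the mechanism behind the calibration differences the abstract reports.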

Clinical Trials (2)

NCT01698632 · KU Leuven

International Ovarian Tumour Analysis (IOTA) Phase 5

The purpose of this study is to learn more about the appearance and behavior of benign-looking adnexal masses.

* Benign-looking means that, when viewed by ultrasound, the mass appears not harmful or not malignant.
* Adnexal refers to the 'adnexa', the space in the female pelvis on either side of the uterus (or where the uterus used to be if you previously had a hysterectomy). The adnexa includes, but is not limited to, the ovaries and the fallopian tubes.
* Masses refers to a variety of structures, including but not limited to:
  * ovarian cysts, which are fluid-filled sacs within or attached to an ovary
  * ovarian tumors, which can be solid tissue or a combination of cysts and solid tissue
  * hydrosalpinges, which are fluid collections in the fallopian tube

Many women have what appear to be benign adnexal masses. In many cases, surgical removal of the masses is not necessary, yet surgery is often performed for fear that these masses could be cancer. There is not much information available for doctors to know how and when to follow these masses, or which ones will become cancer. This study will combine information from centers all around the world regarding the behavior of all types of benign adnexal masses. The aim of this study is to develop decision tools that help doctors choose the best way to treat these masses, in order to improve the detection of ovarian cancer while reducing the number of unnecessary operations.

145 Works
5 Papers
24 Collaborators
2 Trials
Prognosis; Ovarian Neoplasms; Adnexal Diseases; Cardiovascular Diseases; Stress Disorders, Post-Traumatic; Inflammatory Bowel Diseases; Breast Neoplasms; Prenatal Diagnosis

Positions

Researcher

Leids Universitair Medisch Centrum · Department of Biomedical Data Sciences

Researcher

KU Leuven · Department of Development and Regeneration

Education

KU Leuven