BMJ Health & Care Informatics

Leveraging real-world data for continuous evaluation of computational clinical practice guidelines

Objectives There is a bidirectional interaction between clinical practice guidelines and clinical care, with each informing the other. Systematic signalling of trends in guideline adherence in clinical practice is essential for keeping guidelines up to date. Recent advances in computable care guidelines allow automated evaluation using real-world registry data. Here, we assess the feasibility of this approach by evaluating adherence to Dutch endometrial cancer (EC) guidelines. Methods This retrospective cohort study uses real-world data of EC patients from the Netherlands Cancer Registry (NCR) between January 2010 and May 2022. The Dutch guideline for EC was parsed into clinical decision trees (CDTs). The primary outcome was guideline adherence for multiple (sub)populations, with secondary outcomes encompassing adherence trends, recommendation implementation pace, non-adherent treatment strategies and the impact of additional non-guideline-based patient and tumour characteristics on adherence. Results The Dutch EC guideline was parsed into 10 CDTs, revealing 22 patient and disease characteristics and 46 interventions. NCR data were mapped to CDT data items. Four CDTs were successfully populated with NCR data, and 21 602 cases were assessed. Computed adherence levels showed a mean adherence of 82.7% (range 44–100%). Three statistically significant trends in adherence were identified: two increasing trends in the ‘non-adherent’ compared with the ‘adherent’ group, and one decreasing trend. Discussion This study introduces a novel framework for continuously evaluating (non-)adherence to cancer guidelines. Future efforts should focus on the inclusion of health outcome measurements. Conclusion By integrating real-world data with a computer-interpretable guideline, we effectively calculated various facets of adherence to guidelines for EC.
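As a minimal sketch of how adherence could be computed once a guideline is expressed as a decision tree, the Python below walks each registry record through a toy tree and compares the recommended intervention with the treatment actually received. The tree, the field names (`stage`, `grade`, `treatment`) and the recommendations are hypothetical illustrations; they are not the Dutch EC guideline, the study's CDTs or the NCR data model.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass
class Leaf:
    recommendation: str  # intervention recommended at the end of a path


@dataclass
class Node:
    test: Callable[[dict], bool]  # decision on a patient or tumour characteristic
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def recommend(tree: Union[Node, Leaf], record: dict) -> str:
    """Follow the decision nodes until a leaf (recommendation) is reached."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.test(record) else node.if_false
    return node.recommendation


def adherence(tree: Union[Node, Leaf], records: list) -> Optional[float]:
    """Fraction of evaluable records whose treatment matches the recommendation."""
    # Records without a registered treatment cannot be evaluated and are excluded.
    evaluable = [r for r in records if r.get("treatment") is not None]
    if not evaluable:
        return None
    adherent = sum(1 for r in evaluable if r["treatment"] == recommend(tree, r))
    return adherent / len(evaluable)


# Hypothetical toy tree (assumption, not the actual guideline).
TOY_TREE = Node(
    test=lambda r: r["stage"] == "I",
    if_true=Node(
        test=lambda r: r["grade"] <= 2,
        if_true=Leaf("surgery"),
        if_false=Leaf("surgery + radiotherapy"),
    ),
    if_false=Leaf("surgery + chemotherapy"),
)

# Hypothetical registry records (assumption, not NCR data).
RECORDS = [
    {"stage": "I", "grade": 1, "treatment": "surgery"},
    {"stage": "I", "grade": 3, "treatment": "surgery"},
    {"stage": "III", "grade": 2, "treatment": "surgery + chemotherapy"},
    {"stage": "I", "grade": 2, "treatment": "surgery"},
    {"stage": "II", "grade": 1, "treatment": None},
]

toy_adherence = adherence(TOY_TREE, RECORDS)
```

Grouping the records by diagnosis year before calling `adherence` would give one value per period, which is one plausible starting point for the kind of trend analysis described above.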

Machine learning prediction of germline BRCA1/2 pathogenic variants in patients with ovarian cancer

Objectives To assess the performance of machine learning (ML) algorithms in predicting the presence of germline BRCA1/2 pathogenic variants in ovarian cancer (OC) patients based on clinical–pathological features. Methods Clinical–pathological features of 648 patients with OC tested for BRCA1/2 were analysed using three supervised ML algorithms: random forest, boosting and support vector machine. Results In the ‘test’ sample, boosting proved to be the most effective algorithm (accuracy: 84.5%; precision: 80.0%; recall: 3.1%; area under the curve (AUC): 78.8%), followed by support vector machine (accuracy: 81.4%; precision: 72.7%; recall: 27.6%; AUC: 62.3%) and random forest (accuracy: 74.4%; precision: 55.6%; recall: 14.7%; AUC: 71.3%). In the ‘validation’ sample, accuracy was 79.8% for boosting, 81.7% for support vector machine and 80.8% for random forest. In the most effective algorithm (boosting), family history of OC showed the highest relative influence (52.9), followed by histotype (19.5), personal history of breast cancer (BC) (17.1), age at diagnosis (8.4) and family history of BC (2.2), while International Federation of Gynecology and Obstetrics (FIGO) stage had no influence. Discussion We identified the predictive algorithm that best estimates the a priori likelihood of being a carrier of germline BRCA1/2 pathogenic variants in patients with OC. These findings support a role for ML approaches in predicting BRCA1/2 status in patients with OC, but accuracy and precision are still suboptimal for clinical use, suggesting the need for additional research. Conclusions Results support the selection of relevant clinical features for predictive purposes, which could have significant implications for the clinical management of patients with OC.
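The metrics reported above are all derived from the same confusion matrix, and a short sketch may help show why accuracy can stay high while recall is low when carriers are a minority of the sample. The counts below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        # Share of all cases classified correctly (carriers and non-carriers).
        "accuracy": (tp + tn) / total,
        # Share of predicted carriers who really are carriers.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true carriers the model actually finds.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical counts (assumption): 20 carriers among 100 patients,
# with the classifier flagging only 5 patients as carriers.
example_metrics = classification_metrics(tp=4, fp=1, fn=16, tn=79)
```

With these illustrative counts most carriers are missed, yet accuracy remains above 80% because the many correctly classified non-carriers dominate the total.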

Early detection of female-specific cancers using longitudinal healthcare records with a multichannel convolutional neural network

Objectives Female-specific cancers, including breast, ovarian, cervical and uterine malignancies, lack comprehensive early detection approaches, particularly for ovarian and endometrial cancers where effective population-level screening remains limited. This study aimed to develop and validate a computational method for early detection of female-specific cancers using longitudinal healthcare records. Methods We developed a multichannel convolutional neural network (MCNN) to analyse 36-month pre-diagnostic healthcare records from Taiwan’s National Health Insurance Research Database (NHIRD). The study included 19 954 female patients (596 cancer cases, 19 358 controls) from 1999 to 2013. Log-likelihood ratio feature selection identified the top 10 features across three data modalities (diagnostic codes, medications, medical orders). The six-channel architecture processed temporal patterns through stratified 10-fold cross-validation, with performance compared against nine baseline algorithms. Results MCNN achieved superior balanced performance with a Macro-F₁ score of 0.8443, precision of 0.9135 and recall of 0.7978, outperforming traditional machine learning and deep learning approaches. Feature analysis revealed clinically relevant patterns including tamoxifen therapy, immunohistochemical procedures and cancer-specific diagnostic codes. SHapley Additive exPlanations (SHAP) interpretability analysis demonstrated the model’s ability to identify pre-diagnostic phases through temporal healthcare utilisation patterns. Systematic feature selection reduced computational requirements by over 99%, enabling validation on the population-scale NHIRD. Discussion The multichannel deep learning approach enables unified early detection across four female cancer types using routine administrative data, addressing detection gaps for ovarian and endometrial cancers while providing complementary risk stratification for existing screening programmes. Conclusion Clinical implementation through electronic health record (EHR) integration offers practical pathways for accessible cancer risk assessment during routine healthcare encounters.
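Macro-F₁, the headline metric above, is the unweighted mean of the per-class F₁ scores, so the small cancer class counts as much as the much larger control class. The plain-Python sketch below illustrates that definition on hypothetical labels; it is not the authors' evaluation code, and the label vectors are invented for the example.

```python
def f1_for_class(y_true: list, y_pred: list, label) -> float:
    """F1 score treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: list, y_pred: list) -> float:
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical labels (assumption): 1 = cancer case, 0 = control.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

example_macro_f1 = macro_f1(y_true, y_pred)
```

In this toy example one of the two cases is missed, which lowers the F₁ of the cancer class to 2/3 and pulls the macro average well below the 90% accuracy of the same predictions.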
