Journal

BMC Medical Research Methodology

Papers (8)

The impact of life tables on age standardized net survival of real-life example databases

Abstract Background Population-based, age-standardized net survival estimates provide valuable insights for comparing the effectiveness of cancer treatment and the prospects of cure in an international context. Although numerous studies have previously assessed survival, the choice of life tables may crucially impact the feasibility of such analyses. Therefore, based on available studies, our aim was to understand the critical influence of life tables on net survival estimates. Methods Record-level data of approximately 50,000 breast, cervical, and ovarian cancer patients were extracted from the Hungarian National Cancer Registry. These patients were diagnosed between 2010 and 2014 and were followed up until December 31, 2019. Life tables for the Hungarian female population were taken from the Human Mortality Database and the Human Life-Table Database, and were also compiled according to the EUROCARE methodology and the CONCORD multivariable flexible and Ewbank methodologies. For the last of these, specific parameters were lacking, so simulations were performed to assess the missing values. The calculation of 5-year age-standardized net survival using different life tables revealed limitations in the methodology, highlighting the impact of life table selection on survival estimates. Findings Minor biases were observed in age-standardized net survival when using life tables from different international databases. However, the net survival of breast cancer, which had the most favorable prognosis of the studied malignancies, showed significant discrepancies. Moreover, this research highlights the extreme sensitivity of the applied κ parameter in the CONCORD Ewbank method, underscoring the need for careful consideration when applying this approach. Interpretation The present study sheds light on how the choice of life tables can lead to differences in survival estimates for the same cancer population. It also emphasizes the importance of open methodological discussions to improve the validity and accuracy of international comparability.
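
To make the dependence on life tables concrete, the sketch below computes a simple relative survival ratio (observed survival divided by the expected survival implied by a life table) under two life tables and then age-standardizes the results. This is a simplified stand-in and not the estimator used in the study; the age groups, annual survival probabilities, observed survival values, and weights are all hypothetical numbers chosen for illustration.

```python
from math import prod

def expected_survival(annual_survival_probs):
    """Cumulative expected survival implied by a life table (product of annual probabilities)."""
    return prod(annual_survival_probs)

def relative_survival(observed, annual_survival_probs):
    """Simple relative survival ratio: observed survival / expected survival."""
    return observed / expected_survival(annual_survival_probs)

def age_standardize(values, weights):
    """Weighted average of age-specific values using standard population weights."""
    total = sum(weights.values())
    return sum(values[group] * weights[group] for group in values) / total

# Hypothetical 5-year observed survival and standard weights per age group.
observed_5y = {"15-64": 0.80, "65+": 0.55}
weights = {"15-64": 0.6, "65+": 0.4}

# Two hypothetical life tables that differ only in background mortality at older ages.
life_table_a = {"15-64": [0.995] * 5, "65+": [0.95] * 5}
life_table_b = {"15-64": [0.995] * 5, "65+": [0.94] * 5}

def standardized_relative_survival(life_table):
    by_age = {g: relative_survival(observed_5y[g], life_table[g]) for g in observed_5y}
    return age_standardize(by_age, weights)

standardized_a = standardized_relative_survival(life_table_a)
standardized_b = standardized_relative_survival(life_table_b)
print(round(standardized_a, 3), round(standardized_b, 3))
```

In this toy setup the life table with higher background mortality yields a higher standardized estimate, even though the observed patient survival is identical.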

Multiclass risk models for ovarian malignancy: an illustration of prediction uncertainty due to the choice of algorithm

Abstract Background Assessing malignancy risk is important to choose appropriate management of ovarian tumors. We compared six algorithms to estimate the probabilities that an ovarian tumor is benign, borderline malignant, stage I primary invasive, stage II-IV primary invasive, or secondary metastatic. Methods This retrospective cohort study used 5909 patients recruited from 1999 to 2012 for model development, and 3199 patients recruited from 2012 to 2015 for model validation. Patients were recruited at oncology referral or general centers and underwent an ultrasound examination and surgery ≤ 120 days later. We developed models using standard multinomial logistic regression (MLR), Ridge MLR, random forest (RF), XGBoost, neural networks (NN), and support vector machines (SVM). We used nine clinical and ultrasound predictors but developed models with and without CA125. Results Most tumors were benign (3980 in development and 1688 in validation data); secondary metastatic tumors were least common (246 and 172). The c-statistic (AUROC) to discriminate benign from any type of malignant tumor ranged from 0.89 to 0.92 for models with CA125, and from 0.89 to 0.91 for models without. The multiclass c-statistic ranged from 0.41 (SVM) to 0.55 (XGBoost) for models with CA125, and from 0.42 (SVM) to 0.51 (standard MLR) for models without. Multiclass calibration was best for RF and XGBoost. Estimated probabilities for a benign tumor in the same patient often differed by more than 0.2 (20 percentage points) depending on the model. Net Benefit for diagnosing malignancy was similar for algorithms at the commonly used 10% risk threshold, but was slightly higher for RF at higher thresholds. Comparing models, between 3% (XGBoost vs. NN, with CA125) and 30% (NN vs. SVM, without CA125) of patients fell on opposite sides of the 10% threshold. Conclusion Although several models had similarly good performance, individual probability estimates varied substantially.
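
The two threshold-based quantities mentioned in this abstract can be sketched as follows. The code uses the standard decision-curve definition of net benefit (true positives minus threshold-weighted false positives, per patient) and counts how often two models place the same patient on opposite sides of a threshold. The outcome labels and risk values are hypothetical toy data, not results from the study.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating patients whose predicted risk is at or above the threshold."""
    n = len(y_true)
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

def fraction_on_opposite_sides(risk_a, risk_b, threshold):
    """Share of patients that two models classify differently at the given threshold."""
    disagreements = sum(
        1 for a, b in zip(risk_a, risk_b) if (a >= threshold) != (b >= threshold)
    )
    return disagreements / len(risk_a)

# Hypothetical outcomes (1 = malignant) and risks from two hypothetical models.
y = [1, 0, 0, 1]
risk_model_a = [0.50, 0.20, 0.05, 0.08]
risk_model_b = [0.40, 0.05, 0.06, 0.20]

print(net_benefit(y, risk_model_a, 0.10))
print(fraction_on_opposite_sides(risk_model_a, risk_model_b, 0.10))
```

Two models can have similar net benefit while still disagreeing on many individual patients, which is the kind of discrepancy the abstract reports.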

Weibull parametric model for survival analysis in women with endometrial cancer using clinical and T2-weighted MRI radiomic features

Abstract Background Semiparametric survival analysis such as the Cox proportional hazards (CPH) regression model is commonly employed in endometrial cancer (EC) studies. Although this method does not need to know the baseline hazard function, it cannot estimate the event time ratio (ETR), which measures the relative increase or decrease in survival time. To estimate the ETR, the Weibull parametric model needs to be applied. The objective of this study is to develop and evaluate the Weibull parametric model for EC patients’ survival analysis. Methods Training (n = 411) and testing (n = 80) datasets from EC patients were retrospectively collected to investigate this problem. To determine the optimal CPH model from the training dataset, a bi-level model selection with minimax concave penalty was applied to select clinical and radiomic features which were obtained from T2-weighted MRI images. After the CPH model was built, model diagnostics were carried out to evaluate the proportional hazards assumption with the Schoenfeld test. Survival data were fitted to a Weibull model, and the hazard ratio (HR) and ETR were calculated from the model. Brier score and time-dependent area under the receiver operating characteristic curve (AUC) were compared between CPH and Weibull models. Goodness of fit was measured with the Kolmogorov-Smirnov (KS) statistic. Results Although the proportional hazards assumption holds for fitting EC survival data, the linearity assumption of the model is questionable, as there are trends in the age and cancer grade predictors. The results also showed that there was a significant relation between the EC survival data and the Weibull distribution. Finally, the Weibull model generally had a larger AUC than the CPH model, and it also had a smaller Brier score for EC survival prediction on both training and testing datasets, suggesting that the Weibull model is more accurate for EC survival analysis. Conclusions The Weibull parametric model for EC survival analysis allows simultaneous characterization of the treatment effect in terms of the hazard ratio and the event time ratio (ETR), which is likely to be better understood. This method can be extended to study progression free survival and disease specific survival. Trial registration ClinicalTrials.gov NCT03543215, https://clinicaltrials.gov/, date of registration: 30th June 2017.
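
The reason a Weibull model yields both a hazard ratio and an event time ratio can be illustrated with a short sketch. It assumes the common accelerated failure time parameterization in which a covariate multiplies the Weibull scale by the ETR while the shape stays fixed; under that assumption HR = ETR^(-shape). The function names and the numbers are illustrative and are not taken from the paper.

```python
def weibull_hazard(t, scale, shape):
    """Weibull hazard h(t) = (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1)

def hazard_ratio_from_etr(etr, shape):
    """Under a Weibull AFT model, stretching time by `etr` gives HR = etr ** (-shape)."""
    return etr ** (-shape)

# Hypothetical values: the exposed group survives 1.5 times as long (ETR = 1.5).
shape = 1.3
baseline_scale = 60.0
etr = 1.5

# Numerical check: the ratio of hazards is the same at any time point.
hr_at_12 = weibull_hazard(12.0, baseline_scale * etr, shape) / weibull_hazard(12.0, baseline_scale, shape)
hr_at_48 = weibull_hazard(48.0, baseline_scale * etr, shape) / weibull_hazard(48.0, baseline_scale, shape)
print(hr_at_12, hr_at_48, hazard_ratio_from_etr(etr, shape))
```

Because the ratio of hazards does not depend on time, the Weibull model is consistent with proportional hazards and with accelerated failure time simultaneously.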

Validity and reliability of State-Trait Anxiety Inventory in Danish women aged 45 years and older with abnormal cervical screening results

Abstract Background The State-Trait Anxiety Inventory (STAI) scale was developed in the 1980s and has been widely used both in clinical settings and in research. However, the Danish version of STAI has not been validated. The aim of this study was to assess the validity and reliability of the STAI state anxiety scale in Danish women aged 45 years and older with abnormal cervical cancer screening results. Methods Women ≥45 years referred with an abnormal cervical cytology and healthy volunteers (n = 12) underwent a cognitive interview after completing STAI. Further, STAI was sent out in an electronic questionnaire to women (n = 109) seen at the gynecological department with an abnormal cervical cancer screening test during 2018. Validity and reliability of STAI were evaluated according to the Consensus-based Standards for the selection of health Measurement Instruments (COSMIN) checklist by examining internal consistency, test-retest reliability, measurement error, floor and ceiling effects, construct validity and content validity. Results In the cognitive interviews the content validity was evaluated to be very good. The internal consistency of the scale was excellent with Cronbach’s α = 0.93. Test-retest reliability was good with an intra-class correlation coefficient of 0.80 and the systematic difference between test-retest results was negligible. The construct validity was good. Conclusion To the best of our knowledge, this is the first validation study of the Danish translation of the STAI state anxiety scale. This version of STAI demonstrates acceptable reliability and validity when used in a gynecological setting.
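
Internal consistency as reported here is usually computed with the textbook Cronbach's α formula, α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below implements that formula with the Python standard library; the item responses are hypothetical and unrelated to the study data.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items, each a list of scores (one per respondent)."""
    k = len(items)
    item_variances = sum(variance(scores) for scores in items)
    # Total score per respondent is the sum across items.
    totals = [sum(respondent) for respondent in zip(*items)]
    return k / (k - 1) * (1 - item_variances / variance(totals))

# Hypothetical responses: 3 items answered by 5 respondents on a 1-4 scale.
responses = [
    [1, 2, 3, 4, 2],
    [2, 2, 3, 4, 1],
    [1, 3, 3, 4, 2],
]
print(round(cronbach_alpha(responses), 3))
```

When all items carry identical scores, α equals 1; values fall as items become less correlated with each other.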

Weighted metrics are required when evaluating the performance of prediction models in nested case–control studies

Abstract Background Nested case–control (NCC) designs are efficient for developing and validating prediction models that use expensive or difficult-to-obtain predictors, especially when the outcome is rare. Previous research has focused on how to develop prediction models in this sampling design, but little attention has been given to model validation in this context. We therefore aimed to systematically characterize the key elements for the correct evaluation of the performance of prediction models in NCC data. Methods We proposed how to correctly evaluate prediction models in NCC data, by adjusting performance metrics with sampling weights to account for the NCC sampling. We included in this study the C-index, threshold-based metrics, Observed-to-expected events ratio (O/E ratio), calibration slope, and decision curve analysis. We illustrated the proposed metrics with a validation of the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA version 5) in data from the population-based Rotterdam study. We compared the metrics obtained in the full cohort with those obtained in NCC datasets sampled from the Rotterdam study, with and without a matched design. Results Performance metrics without weight adjustment were biased: the unweighted C-index in NCC datasets was 0.61 (0.58–0.63) for the unmatched design, while the C-index in the full cohort and the weighted C-index in the NCC datasets were similar: 0.65 (0.62–0.69) and 0.65 (0.61–0.69), respectively. The unweighted O/E ratio was 18.38 (17.67–19.06) in the NCC datasets, while it was 1.69 (1.42–1.93) in the full cohort and its weighted version in the NCC datasets was 1.68 (1.53–1.84). Similarly, weighted adjustments of threshold-based metrics and net benefit for decision curves were unbiased estimates of the corresponding metrics in the full cohort, while the corresponding unweighted metrics were biased. 
In the matched design, the bias of the unweighted metrics was larger, but it could also be compensated by the weight adjustment. Conclusions Nested case–control studies are an efficient solution for evaluating the performance of prediction models that use expensive or difficult-to-obtain biomarkers, especially when the outcome is rare, but the performance metrics need to be adjusted to the sampling procedure.
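
The effect of weight adjustment on the O/E ratio can be sketched with a toy example. It assumes a binary outcome and known inclusion probabilities, so that each sampled subject is weighted by the inverse of its probability of being sampled; the actual weighting scheme in a time-to-event nested case–control study is more involved, and all numbers below are hypothetical.

```python
def observed_expected_ratio(outcomes, predicted_risks, weights=None):
    """O/E ratio: (weighted) observed events divided by (weighted) sum of predicted risks."""
    if weights is None:
        weights = [1.0] * len(outcomes)
    observed = sum(w * y for w, y in zip(weights, outcomes))
    expected = sum(w * p for w, p in zip(weights, predicted_risks))
    return observed / expected

# Hypothetical cohort of 1000 with 10 cases; the sample keeps all cases and 20 of 990 controls.
n_cases, n_controls_sampled, n_controls_total = 10, 20, 990
outcomes = [1] * n_cases + [0] * n_controls_sampled
predicted = [0.02] * n_cases + [0.01] * n_controls_sampled
# Inverse inclusion probability: cases always sampled, controls sampled with probability 20/990.
weights = [1.0] * n_cases + [n_controls_total / n_controls_sampled] * n_controls_sampled

unweighted_oe = observed_expected_ratio(outcomes, predicted)
weighted_oe = observed_expected_ratio(outcomes, predicted, weights)
print(unweighted_oe, weighted_oe)
```

Without weights the over-representation of cases inflates the ratio; with weights the sample again represents the source cohort.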

Assessing treatment effects with adjusted restricted mean time lost in observational competing risks data

According to long-term follow-up data of malignant tumor patients, assessing treatment effects requires careful consideration of competing risks. The commonly used cause-specific hazard ratio (CHR) and sub-distribution hazard ratio (SHR) are relative indicators and may present challenges in terms of the proportional hazards assumption and clinical interpretation. Recently, the restricted mean time lost (RMTL) has been recommended as a supplementary measure for better clinical interpretation. Moreover, for observational study data in epidemiological and clinical settings, due to the influence of confounding factors, covariate adjustment is crucial for determining the causal effect of treatment. We construct an RMTL estimator after adjusting for covariates based on the inverse probability weighting method, and derive the variance to construct interval estimates based on the large sample properties. We use simulation studies to study the statistical performance of this estimator in various scenarios. In addition, we further consider the changes in treatment effects over time, constructing a dynamic RMTL difference curve and corresponding confidence bands for the curve. The simulation results demonstrate that the adjusted RMTL estimator exhibits smaller biases compared with unadjusted RMTL and provides robust interval estimates in all scenarios. This method was applied to real-world cervical cancer patient data, revealing improvements in the prognosis of patients with small cell carcinoma of the cervix. The results showed that the protective effect of surgery was significant only in the first 20 months, but the long-term effect was not obvious. Radiotherapy significantly improved patient outcomes during the follow-up period from 17 to 57 months, while radiotherapy combined with chemotherapy significantly improved patient outcomes throughout the entire period. We propose an approach that is easy to interpret and implement for assessing treatment effects in observational competing risk data.
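
For readers unfamiliar with the quantity, the RMTL up to a horizon τ is the area under the cumulative incidence function (CIF) of the event of interest on [0, τ]. The sketch below integrates a step-function CIF and takes the difference between two groups. It does not implement the inverse probability weighted estimator described in the abstract; the jump times and CIF values are hypothetical.

```python
def restricted_mean_time_lost(jump_times, cif_values, tau):
    """Area under a right-continuous step CIF on [0, tau].

    jump_times must be increasing; cif_values[i] is the CIF from jump_times[i] onward.
    """
    area = 0.0
    prev_time, prev_cif = 0.0, 0.0
    for time, cif in zip(jump_times, cif_values):
        if time >= tau:
            break
        area += prev_cif * (time - prev_time)
        prev_time, prev_cif = time, cif
    # Last segment runs from the final jump before tau up to tau itself.
    area += prev_cif * (tau - prev_time)
    return area

# Hypothetical CIFs (time in months) for a treated and a control group.
treated = restricted_mean_time_lost([12, 36], [0.05, 0.15], tau=60)
control = restricted_mean_time_lost([6, 24], [0.10, 0.30], tau=60)
print(treated, control, treated - control)
```

A negative difference means the treated group loses less time, on average, to the event of interest within the horizon.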

Geographically weighted accelerated failure time model for spatial survival data: application to ovarian cancer survival data in New Jersey

In large multiregional cohort studies, survival data is often collected at small geographical levels (such as counties) and aggregated at larger levels, leading to correlated patterns that are associated with location. Traditional studies typically analyze such data globally or locally by region, often neglecting the spatial information inherent in the data, which can introduce bias in effect estimates and potentially reduce statistical power. We propose a Geographically Weighted Accelerated Failure Time Model for spatial survival data to investigate spatial heterogeneity. We establish a weighting scheme and bandwidth selection based on quasi-likelihood information criteria. Theoretical properties of the proposed estimators are thoroughly examined. To demonstrate the efficacy of the model in various scenarios, we conduct a simulation study with different sample sizes, both with and without adherence to the proportional hazards assumption. Additionally, we apply the proposed method to analyze ovarian cancer survival data from the Surveillance, Epidemiology, and End Results cancer registry in the state of New Jersey. Our simulation results indicate that the proposed model exhibits superior performance in terms of four measurements compared to existing methods, including the geographically weighted Cox model, when the proportional hazards assumption is violated. Furthermore, in scenarios where the sample size per location is 20-25, the local model could not be fitted to the simulated data, while our proposed model still demonstrates satisfactory performance. In the empirical study, we identify clear spatial variations in the effects of all three covariates. Our proposed model offers a novel approach to exploring spatial heterogeneity of survival data compared to global and local models, providing an alternative to geographically weighted Cox regression when the proportional hazards assumption is not met. It addresses the issue of certain counties' survival data being unable to fit the model due to limited samples, particularly in the context of rare diseases.
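
The general idea behind geographic weighting is that observations near a target location contribute more to the local fit than distant ones. The abstract does not specify the kernel, so the sketch below assumes a Gaussian kernel, a common choice in geographically weighted regression; the coordinates and bandwidth are hypothetical.

```python
import math

def gaussian_kernel_weights(target, locations, bandwidth):
    """Weight each location by exp(-0.5 * (distance / bandwidth) ** 2) relative to the target."""
    weights = []
    for x, y in locations:
        distance = math.hypot(x - target[0], y - target[1])
        weights.append(math.exp(-0.5 * (distance / bandwidth) ** 2))
    return weights

# Hypothetical county centroids in arbitrary planar units.
county_centroids = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
weights_for_first_county = gaussian_kernel_weights((0.0, 0.0), county_centroids, bandwidth=5.0)
print(weights_for_first_county)
```

A larger bandwidth borrows more information from neighbouring locations, which is one way a local fit can remain feasible when individual locations have few observations.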

Improving postal survey response using behavioural science: a nested randomised control trial

Abstract Background Systematic reviews have identified effective strategies for increasing postal response rates to questionnaires; however, most studies have isolated single techniques, testing the effect of each one individually. Despite providing insight into explanatory mechanisms, this approach lacks ecological validity, given that multiple techniques are often combined in routine practice. Methods We used a two-armed parallel randomised controlled trial (n = 2702), nested within a cross-sectional health survey study, to evaluate whether using a pragmatic combination of behavioural science and evidence-based techniques (e.g., personalisation, social norms messaging) in a study invitation letter increased response to the survey, when compared with a standard invitation letter. Participants and outcome assessors were blinded to group assignment. We tested this in a sample of women testing positive for human papillomavirus (HPV) at cervical cancer screening in England. Results Overall, 646 participants responded to the survey (response rate [RR] = 23.9%). Logistic regression revealed higher odds of response in the intervention arm (n = 357/1353, RR = 26.4%) compared with the control arm (n = 289/1349, RR = 21.4%), while adjusting for age, deprivation, clinical site, and clinical test result (aOR = 1.30, 95% CI: 1.09–1.55). Conclusion Applying easy-to-implement behavioural science and evidence-based methods to routine invitation letters improved postal response to a health-related survey, whilst adjusting for demographic characteristics. Our findings provide support for the pragmatic adoption of combined techniques in routine research to increase response to postal surveys. Trial registration ISRCTN, ISRCTN15113095. Registered 7 May 2019 – retrospectively registered.
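
The response rates in this abstract can be reproduced directly from the reported counts. The sketch below also computes an unadjusted odds ratio with a Wald confidence interval; this is a simpler quantity than the adjusted odds ratio reported in the abstract, which comes from a logistic regression with covariates that cannot be reconstructed from the counts alone.

```python
import math

def response_rate(responders, invited):
    """Response rate as a percentage."""
    return 100.0 * responders / invited

def unadjusted_odds_ratio(resp_a, n_a, resp_b, n_b, z=1.96):
    """Odds ratio of response (arm A vs. arm B) with a Wald confidence interval."""
    non_a, non_b = n_a - resp_a, n_b - resp_b
    odds_ratio = (resp_a / non_a) / (resp_b / non_b)
    se_log_or = math.sqrt(1 / resp_a + 1 / non_a + 1 / resp_b + 1 / non_b)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Counts as reported in the abstract.
rr_intervention = response_rate(357, 1353)
rr_control = response_rate(289, 1349)
rr_overall = response_rate(646, 2702)
odds_ratio, ci_lower, ci_upper = unadjusted_odds_ratio(357, 1353, 289, 1349)
print(round(rr_intervention, 1), round(rr_control, 1), round(rr_overall, 1))
print(round(odds_ratio, 2), round(ci_lower, 2), round(ci_upper, 2))
```

The unadjusted estimate lands close to the adjusted one, as would be expected in a randomised comparison where covariates are balanced between arms.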

Publisher

Springer Science and Business Media LLC

ISSN

1471-2288