To investigate the interobserver agreement of the International Ovarian Tumour Analysis (IOTA) ultrasound‐based simple rules risk (SRRisk) score, the logistic regression model 2 (LR2), the Assessment of Different NEoplasias in the adneXa (ADNEX) model and the Ovarian‐Adnexal Reporting and Data System (O‐RADS) in an Australian, population‐based context.
A retrospective multi‐centre study was performed between January 2020 and January 2021. The study included 198 women with adnexal masses examined with transvaginal grey scale and power Doppler ultrasound. Participants were recruited from the multidisciplinary team (MDT) oncology meetings of two tertiary cancer centres. Two independent radiologists described the adnexal masses according to the SRRisk score, LR2, the ADNEX model and O‐RADS. Scores that differed between the two observers by more than 30 units were considered differential, and those that differed by more than 50 units were considered highly differential.
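The discrepancy thresholds described above can be illustrated with a minimal sketch. It assumes that the "units" are points on a 0 to 100 risk scale and that the two classes are mutually exclusive; the function name and the "concordant" label are hypothetical and are not taken from the study.

```python
def classify_difference(score_a, score_b, differential=30.0, highly_differential=50.0):
    """Classify the gap between two observers' risk scores for one mass.

    Assumption: scores are on a 0-100 scale and the thresholds are strict
    (a gap of exactly 30 is not counted as differential).
    """
    gap = abs(score_a - score_b)
    if gap > highly_differential:
        return "highly differential"
    if gap > differential:
        return "differential"
    return "concordant"  # hypothetical label for gaps within the threshold


# Illustrative values only, not study data.
print(classify_difference(12.0, 47.5))  # gap of 35.5
print(classify_difference(8.0, 71.0))   # gap of 63.0
```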
Of the 198 patients, 128 were diagnosed with benign ovarian masses, 53 with malignant masses and 17 with borderline tumours. There was strong agreement (Cohen's kappa 0.8) for intra‐tumour blood flow, number of cyst locules, and presence of blood flow within solid projections. Interobserver agreement was moderate (Cohen's kappa 0.60–0.79) for the presence of free pelvic fluid/ascites, solid components, unilocular cysts and acoustic shadows. For SRRisk, 10/198 (5%) cases were highly differential and 38/198 (19%) were differential; for LR2, 20/198 (10%) were highly differential and 36/198 (18%) were differential; and for the ADNEX model, 10/198 (5%) were highly differential and 24/198 (12%) were differential. Comparison of O‐RADS scores between the two observers showed moderate agreement, with a kappa of 0.65. In 7/198 (4%) cases, the two observers differed by two or more O‐RADS categories.
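For readers unfamiliar with the agreement statistic reported here, the following is a minimal sketch of unweighted Cohen's kappa for two raters. It is not the study's analysis code; whether a weighted variant was used for the ordinal O‐RADS categories is not stated, and the example ratings are invented for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases where both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Invented ratings (1 = feature present, 0 = absent), not study data.
observer_1 = [1, 1, 0, 0]
observer_2 = [1, 0, 0, 0]
print(cohens_kappa(observer_1, observer_2))
```

Kappa corrects the raw proportion of matching ratings for the agreement expected by chance alone, which is why it can be substantially lower than the simple percentage agreement.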
Our results suggest that interobserver variation is present when adnexal masses are evaluated with well‐established ultrasonographic diagnostic models. Implementation of sonographic ovarian cancer risk prediction models will need to take this into account by ensuring that examiners are adequately trained in the technique and that standard operating procedures are in place to reduce interobserver variability.