Journal

Computerized Medical Imaging and Graphics

Papers (13)

Hierarchical pathology screening for cervical abnormality

Cervical smear screening is an imaging-based cancer detection tool, which is of pivotal importance for the early-stage diagnosis. A computer-aided screening system can automatically find out if the scanned whole-slide images (WSI) with cervical cells are classified as "abnormal" or "normal", and then alert pathologists. It can significantly reduce the workload for human experts, and is therefore highly demanded in clinical practice. Most of the screening methods are based on automatic cervical cell detection and classification, but the accuracy is generally limited due to the high variation of cell appearance and lacking context information from the surroundings. Here we propose a novel and hierarchical framework for automatic cervical smear screening aiming at the robust performance of case-level diagnosis and finding suspected "abnormal" cells. Our framework consists of three stages. We commence by extracting a large number of pathology images from the scanned WSIs, and implementing abnormal cell detection to each pathology image. Then, we feed the detected "abnormal" cells with corresponding confidence into our novel classification model for a comprehensive analysis of the extracted pathology images. Finally, we summarize the classification outputs of all extracted images, and determine the overall screening result for the target case. Experiments show that our three-stage hierarchical method can effectively suppress the errors from cell-level detection, and provide an effective and robust way for cervical abnormality screening.

Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer

Cervical cancer poses a major health threat to women globally. Optical coherence tomography (OCT) imaging has recently shown promise for non-invasive cervical lesion diagnosis. However, obtaining high-quality labeled cervical OCT images is challenging and time-consuming as they must correspond precisely with pathological results. The scarcity of such high-quality labeled data hinders the application of supervised deep-learning models in practical clinical settings. This study addresses the above challenge by proposing CMSwin, a novel self-supervised learning (SSL) framework combining masked image modeling (MIM) with contrastive learning based on the Swin-Transformer architecture to utilize abundant unlabeled cervical OCT images. In this contrastive-MIM framework, mixed image encoding is combined with a latent contextual regressor to solve the inconsistency problem between pre-training and fine-tuning and separate the encoder's feature extraction task from the decoder's reconstruction task, allowing the encoder to extract better image representations. Besides, contrastive losses at the patch and image levels are elaborately designed to leverage massive unlabeled data. We validated the superiority of CMSwin over the state-of-the-art SSL approaches with five-fold cross-validation on an OCT image dataset containing 1,452 patients from a multi-center clinical study in China, plus two external validation sets from top-ranked Chinese hospitals: the Huaxi dataset from the West China Hospital of Sichuan University and the Xiangya dataset from the Xiangya Second Hospital of Central South University. A human-machine comparison experiment on the Huaxi and Xiangya datasets for volume-level binary classification also indicates that CMSwin can match or exceed the average level of four skilled medical experts, especially in identifying high-risk cervical lesions. Our work has great potential to assist gynecologists in intelligently interpreting cervical OCT images in clinical settings. Additionally, the integrated GradCAM module of CMSwin enables cervical lesion visualization and interpretation, providing good interpretability for gynecologists to diagnose cervical diseases efficiently.

MFFUNet: A hybrid model with cross-attention-guided multi-feature fusion for automated segmentation of organs at risk in cervical cancer brachytherapy

Brachytherapy is a common treatment option for cervical cancer. An important step involved in brachytherapy is the delineation of organs at risk (OARs) based on computed tomography (CT) images. Automating OARs segmentation in brachytherapy has the benefit of both reducing the time and improving the quality of radiation therapy planning. This paper introduces a novel segmentation model named MFFUNet for the automatic contour delineation of OARs in cervical cancer brachytherapy. The proposed model employs a staged encoder-decoder structure, integrating the self-attention mechanism of Transformer with the CNN framework. A novel multi-features fusion (MFF) block with a cross-attention-guided feature fusion mechanism is also proposed, which efficiently extracts and cross-fuses features from multiple receptive fields, enriching the semantic information of the features and thus improving the performance of complex segmentation tasks. A private CT image dataset of 95 patients with cervical cancer undergoing brachytherapy is used to evaluate the segmentation performance of the proposed method. The OARs in the data consist of the bladder, rectum, and colon surrounding the cervix. The proposed model surpasses current mainstream OARs segmentation models in terms of segmentation accuracy. The mean Dice similarity coefficient (DSC) score of all three OARs has achieved 73.69%. Among them, the DSC score for the bladder is 92.65%, for the rectum is 66.55%, and for the colon is 61.86%. Moreover, we also conducted experiments on two common public thoracoabdominal multi-organ CT datasets. The excellent segmentation performance further demonstrates the generalization ability of our model. In conclusion, MFFUNet has demonstrated outstanding effectiveness in segmenting OARs for cervical cancer brachytherapy. By accurately delineating OARs, it enhances radiotherapy planning precision and helps reduce radiation toxicity, improving patient outcomes.

FreqYOLO: A uterine disease detection network based on local and global frequency feature learning

Leiomyomas (LM) and adenomyosis (AM) are common gynecological diseases with high incidence rates and an increasing trend of affecting younger women. Accurate detection and differentiation of LM and AM in ultrasound images are crucial for selecting appropriate treatment options. Due to the heterogeneity of these two diseases, the location, size, and number of lesions often vary significantly, posing substantial challenges for sonographers to conduct manual examinations. In this study, we propose a frequency feature learning-based detection method, FreqYOLO, for detecting LM and AM in ultrasound images. Specifically, in the dual-branch feature encoder, we introduce global and local frequency features. Subsequently, we apply a Fusion Neck to perform multi-scale fusion of the global and local features, enriching the frequency information. Finally, an improved anchor suppression method is employed to output the optimal detection anchors. The proposed FreqYOLO is compared with several state-of-the-art techniques, achieving a Recall of 0.734, Precision of 0.795, F1 score of 0.763, AP50 of 0.788, and mAP of 0.487. The results demonstrate that the FreqYOLO exhibits better detection performance of detecting and differentiating LM and AM.

MDAL: Modality-difference-based active learning for multimodal medical image analysis via contrastive learning and pointwise mutual information

Multimodal medical images reveal different characteristics of the same anatomy or lesion, offering significant clinical value. Deep learning has achieved widespread success in medical image analysis with large-scale labeled datasets. However, annotating medical images is expensive and labor-intensive for doctors, and the variations between different modalities further increase the annotation cost for multimodal images. This study aims to minimize the annotation cost for multimodal medical image analysis. We proposes a novel active learning framework MDAL based on modality differences for multimodal medical images. MDAL quantifies the sample-wise modality differences through pointwise mutual information estimated by multimodal contrastive learning. We hypothesize that samples with larger modality differences are more informative for annotation and further propose two sampling strategies based on these differences: MaxMD and DiverseMD. Moreover, MDAL could select informative samples in one shot without initial labeled data. We evaluated MDAL on public brain glioma and meningioma segmentation datasets and an in-house ovarian cancer classification dataset. MDAL outperforms other advanced active learning competitors. Besides, when using only 20%, 20%, and 15% of labeled samples in these datasets, MDAL reaches 99.6%, 99.9%, and 99.3% of the performance of supervised training with full labeled dataset, respectively. The results show that our proposed MDAL could significantly reduce the annotation cost for multimodal medical image analysis. We expect MDAL could be further extended to other multimodal medical data for lower annotation costs.

A multi-view contrastive learning and semi-supervised self-distillation framework for early recurrence prediction in ovarian cancer

This study presents a novel framework that integrates contrastive learning and knowledge distillation to improve early ovarian cancer (OC) recurrence prediction, addressing the challenges posed by limited labeled data and tumor heterogeneity. The research utilized CT imaging data from 585 OC patients, including 142 cases with complete follow-up information and 125 cases with unknown recurrence status. To pre-train the teacher network, 318 unlabeled images were sourced from public datasets (TCGA-OV and PLAGH-202-OC). Multi-view contrastive learning (MVCL) was employed to generate multi-view 2D tumor slices, enhancing the teacher network's ability to extract features from complex, heterogeneous tumors with high intra-class variability. Building on this foundation, the proposed semi-supervised multi-task self-distillation (Semi-MTSD) framework integrated OC subtyping as an auxiliary task using multi-task learning (MTL). This approach allowed the co-training of a student network for recurrence prediction, leveraging both labeled and unlabeled data to improve predictive performance in data-limited settings. The student network's performance was assessed using preoperative CT images with known recurrence outcomes. Evaluation metrics included area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), specificity (SPE), F1 score, floating-point operations (FLOPs), parameter count, training time, inference time, and mean corruption error (mCE). The proposed framework achieved an ACC of 0.862, an AUC of 0.916, a SPE of 0.895, and an F1 score of 0.831, surpassing existing methods for OC recurrence prediction. Comparative and ablation studies validated the model's robustness, particularly in scenarios characterized by data scarcity and tumor heterogeneity. The MVCL and Semi-MTSD framework demonstrates significant advancements in OC recurrence prediction, showcasing strong generalization capabilities in complex, data-constrained environments. This approach offers a promising pathway toward more personalized treatment strategies for OC patients.

Interpretable multi-stage attention network to predict cancer subtype, microsatellite instability, TP53 mutation and TMB of endometrial and colorectal cancer

Mismatch repair deficiency (dMMR), also known as high-grade microsatellite instability (MSI-H), is a well-established biomarker for predicting the immunotherapy response in endometrial cancer (EC) and colorectal cancer (CRC). Tumor mutational burden (TMB) has also emerged as an important quantitative genomic biomarker for assessing the efficacy of immune checkpoint inhibitors. Although next-generation sequencing (NGS) can be used to assess MSI and TMB, the high costs, low sample throughput, and significant DNA requirements make NGS impractical for routine clinical screening. In this study, an interpretable, multi-stage attention deep learning (DL) network is introduced to predict pathological subtypes, MSI, TP53 mutations, and TMB directly from low-cost, routinely used histopathological whole slide images of EC and CRC slides. Experimental results showed that this method consistently outperformed seven state-of-the-art approaches in cancer subtyping and molecular status prediction across EC and CRC datasets. Fisher's Least Significant Difference test confirmed a strong correlation between model predictions and actual molecular statuses (MSI, TP53, and TMB) (p<0.001). Furthermore, Kaplan-Meier disease-free survival analysis revealed that CRC patients with model-predicted high TMB had significantly longer disease-free survival than those with low TMB (p<0.05). These findings demonstrate that the proposed DL-based approach holds significant potential for directly predicting immunotherapy-related pathological diagnoses and molecular statuses from routine WSIs, supporting personalized cancer immunotherapy treatment decisions in EC and CRC.

Interstitial-guided automatic clinical tumor volume segmentation network for cervical cancer brachytherapy

Automatic clinical tumor volume (CTV) delineation is pivotal to improving outcomes for interstitial brachytherapy cervical cancer. However, the prominent differences in gray values due to the interstitial needles bring great challenges on deep learning-based segmentation model. In this study, we proposed a novel interstitial-guided segmentation network termed advance reverse guided network (ARGNet) for cervical tumor segmentation with interstitial brachytherapy. Firstly, the location information of interstitial needles was integrated into the deep learning framework via multi-task by a cross-stitch way to share encoder feature learning. Secondly, a spatial reverse attention mechanism is introduced to mitigate the distraction characteristic of needles on tumor segmentation. Furthermore, an uncertainty area module is embedded between the skip connections and the encoder of the tumor segmentation task, which is to enhance the model's capability in discerning ambiguous boundaries between the tumor and the surrounding tissue. Comprehensive experiments were conducted retrospectively on 191 CT scans under multi-course interstitial brachytherapy. The experiment results demonstrated that the characteristics of interstitial needles play a role in enhancing the segmentation, achieving the state-of-the-art performance, which is anticipated to be beneficial in radiotherapy planning.

Cell comparative learning: A cervical cytopathology whole slide image classification method using normal and abnormal cells

Automated cervical cancer screening through computer-assisted diagnosis has shown considerable potential to improve screening accessibility and reduce associated costs and errors. However, classification performance on whole slide images (WSIs) remains suboptimal due to patient-specific variations. To improve the precision of the screening, pathologists not only analyze the characteristics of suspected abnormal cells, but also compare them with normal cells. Motivated by this practice, we propose a novel cervical cell comparative learning method that leverages pathologist knowledge to learn the differences between normal and suspected abnormal cells within the same WSI. Our method employs two pre-trained YOLOX models to detect suspected abnormal and normal cells in a given WSI. A self-supervised model then extracts features for the detected cells. Subsequently, a tailored Transformer encoder fuses the cell features to obtain WSI instance embeddings. Finally, attention-based multi-instance learning is applied to achieve classification. The experimental results show an AUC of 0.9319 for our proposed method. Moreover, the method achieved professional pathologist-level performance, indicating its potential for clinical applications.

Multi-task network for automated analysis of high-resolution endomicroscopy images to detect cervical precancer and cancer

Cervical cancer is a public health emergency in low- and middle-income countries where resource limitations hamper standard-of-care prevention strategies. The high-resolution endomicroscope (HRME) is a low-cost, point-of-care device with which care providers can image the nuclear morphology of cervical lesions. Here, we propose a deep learning framework to diagnose cervical intraepithelial neoplasia grade 2 or more severe from HRME images. The proposed multi-task convolutional neural network uses nuclear segmentation to learn a diagnostically relevant representation. Nuclear segmentation was trained via proxy labels to circumvent the need for expensive, manually annotated nuclear masks. A dataset of images from over 1600 patients was used to train, validate, and test our algorithm; data from 20% of patients were reserved for testing. An external evaluation set with images from 508 patients was used to further validate our findings. The proposed method consistently outperformed other state-of-the art architectures achieving a test per patient area under the receiver operating characteristic curve (AUC-ROC) of 0.87. Performance was comparable to expert colposcopy with a test sensitivity and specificity of 0.94 (p = 0.3) and 0.58 (p = 1.0), respectively. Patients with recurrent human papillomavirus (HPV) infections are at a higher risk of developing cervical cancer. Thus, we sought to incorporate HPV DNA test results as a feature to inform prediction. We found that incorporating patient HPV status improved test specificity to 0.71 at a sensitivity of 0.94.

Interpretable attention-based deep learning ensemble for personalized ovarian cancer treatment without manual annotations

Inhibition of pathological angiogenesis has become one of the first FDA approved targeted therapies widely tested in anti-cancer treatment, i.e. VEGF-targeting monoclonal antibody bevacizumab, in combination with chemotherapy for frontline and maintenance therapy for women with newly diagnosed ovarian cancer. Identification of the best predictive biomarkers of bevacizumab response is necessary in order to select patients most likely to benefit from this therapy. Hence, this study investigates the protein expression patterns on immunohistochemical whole slide images of three angiogenesis related proteins, including Vascular endothelial growth factor, Angiopoietin 2 and Pyruvate kinase isoform M2, and develops an interpretable and annotation-free attention based deep learning ensemble framework to predict the bevacizumab therapeutic effect on patients with epithelial ovarian cancer or peritoneal serous papillary carcinoma using tissue microarrays (TMAs). In evaluation with five-fold cross validation, the proposed ensemble model using the protein expressions of both Pyruvate kinase isoform M2 and Angiopoietin 2 achieves a notably high F-score (0.99±0.02), accuracy (0.99±0.03), precision (0.99±0.02), recall (0.99±0.02) and AUC (1.00±0). Kaplan-Meier progression free survival analysis confirms that the proposed ensemble is able to identify patients in the predictive therapeutic sensitive group with low cancer recurrence (p<0.001), and the Cox proportional hazards model analysis further confirms the above statement (p=0.012). In conclusion, the experimental results demonstrate that the proposed ensemble model using the protein expressions of both Pyruvate kinase isoform M2 and Angiopoietin 2 can assist treatment planning of bevacizumab targeted therapy for patients with ovarian cancer.

QUIZ: An arbitrary volumetric point matching method for medical image registration

Rigid pre-registration involving local-global matching or other large deformation scenarios is crucial. Current popular methods rely on unsupervised learning based on grayscale similarity, but under circumstances where different poses lead to varying tissue structures, or where image quality is poor, these methods tend to exhibit instability and inaccuracies. In this study, we propose a novel method for medical image registration based on arbitrary voxel point of interest matching, called query point quizzer (QUIZ). QUIZ focuses on the correspondence between local-global matching points, specifically employing CNN for feature extraction and utilizing the Transformer architecture for global point matching queries, followed by applying average displacement for local image rigid transformation.We have validated this approach on a large deformation dataset of cervical cancer patients, with results indicating substantially smaller deviations compared to state-of-the-art methods. Remarkably, even for cross-modality subjects, it achieves results surpassing the current state-of-the-art.

Weakly supervised deep learning for prediction of treatment effectiveness on ovarian cancer from histopathology images

Despite the progress made during the last two decades in the surgery and chemotherapy of ovarian cancer, more than 70 % of advanced patients are with recurrent cancer and decease. Surgical debulking of tumors following chemotherapy is the conventional treatment for advanced carcinoma, but patients with such treatment remain at great risk for recurrence and developing drug resistance, and only about 30 % of the women affected will be cured. Bevacizumab is a humanized monoclonal antibody, which blocks VEGF signaling in cancer, inhibits angiogenesis and causes tumor shrinkage, and has been recently approved by FDA as a monotherapy for advanced ovarian cancer in combination with chemotherapy. Considering the cost, potential toxicity, and finding that only a portion of patients will benefit from these drugs, the identification of new predictive method for the treatment of ovarian cancer remains an urgent unmet medical need. In this study, we develop weakly supervised deep learning approaches to accurately predict therapeutic effect for bevacizumab of ovarian cancer patients from histopathological hematoxylin and eosin stained whole slide images, without any pathologist-provided locally annotated regions. To the authors' best knowledge, this is the first model demonstrated to be effective for prediction of the therapeutic effect of patients with epithelial ovarian cancer to bevacizumab. Quantitative evaluation of a whole section dataset shows that the proposed method achieves high accuracy, 0.882 ± 0.06; precision, 0.921 ± 0.04, recall, 0.912 ± 0.03; F-measure, 0.917 ± 0.07 using 5-fold cross validation and outperforms two state-of-the art deep learning approaches Coudray et al. (2018), Campanella et al. (2019). For an independent TMA testing set, the three proposed methods obtain promising results with high recall (sensitivity) 0.946, 0.893 and 0.964, respectively. The results suggest that the proposed method could be useful for guiding treatment by assisting in filtering out patients without positive therapeutic response to suffer from further treatments while keeping patients with positive response in the treatment process. Furthermore, according to the statistical analysis of the Cox Proportional Hazards Model, patients who were predicted to be invalid by the proposed model had a very high risk of cancer recurrence (hazard ratio = 13.727) than patients predicted to be effective with statistical signifcance (p < 0.05).

Publisher

Elsevier BV

ISSN

0895-6111