Investigator

Kris Lami

Nagasaki University

Collaborators (7)
Sompon Apornvirat, Thiyaphat Laohawetwan…, Andrey Bychkov, A.V. Asaturova, Ethan N. Okoshi, Junya Fukuoka, Kei Tanaka
Institutions (4)
Nagasaki University, Thammasat University, Kameda Medical Center, National Medical Rese…

Papers

Cytology Screening Using Z‐Stack Digital Slides: A Validation Study

ABSTRACT

Background: For applications of digital pathology in cytology, challenges such as focal precision and data volume remain. The goals of this validation study are to compare diagnostic accuracy, screening time, annotation counts, and inter‐ and intra‐observer agreement between digital slides using Z‐stack scanning (z‐WSI) and conventional glass slides in liquid‐based cervical cytology (LBC).

Methods: We collected 91 LBC samples, with an equal number of NILM, LSIL, HSIL, and SCC cases. Four cytotechnologists evaluated cases using glass slides and z‐WSI separately. They classified cases under two separate schemas: (1) “Screening‐2‐Category”: NILM (normal) vs. other lesions (ASC‐US and above); and (2) “Morpho‐3‐Category”: NILM vs. LSIL (mild dysplasia) vs. ASC‐H and higher (moderate dysplasia to squamous cell carcinoma) to reflect lesion severity and treatment implications.

Results: For Screening‐2‐Category classifications, inter‐observer agreement was 0.685 for glass slides and 0.637 for z‐WSI, with intra‐observer agreement ranging from 82.4% to 95.6%. For Morpho‐3‐Category classifications, inter‐observer agreement was 0.700 for glass slides and 0.598 for z‐WSI, indicating reduced agreement with z‐WSI. Accuracy was 91.2% (glass slides) and 87.1% (z‐WSI) for Screening‐2‐Category, and 86.5% and 81.0% for Morpho‐3‐Category, with no significant differences. In both modalities, cytotechnologists tended to apply more annotations in true positive cases but fewer in false negative cases. Screening time for z‐WSI was 2–5 min longer on average for all cytotechnologists.

Conclusion: z‐WSI is not completely equivalent to glass slides, but it has the potential to be used as a tool for cytology screening. Training specifically designed for WSI is expected to enhance diagnostic accuracy and improve workflow efficiency.

Evaluation of general-purpose large language models as diagnostic support tools in cervical cytology

The application of general-purpose large language models (LLMs) in cytopathology remains largely unexplored. This study aims to evaluate the accuracy and consistency of a custom version of ChatGPT-4 (GPT), ChatGPT o3, and Gemini 2.5 Pro as diagnostic support tools for cervical cytology. A total of 200 Papanicolaou-stained cervical cytology images were acquired at 40× magnification, each measuring 384 × 384 pixels. These images consisted of 100 cases classified as negative for intraepithelial lesion or malignancy (NILM) and 100 cases across various abnormal categories: 20 low-grade squamous intraepithelial lesion (LSIL), 20 high-grade squamous intraepithelial lesion (HSIL), 20 squamous cell carcinoma (SCC), 20 adenocarcinoma in situ (AIS), and 20 adenocarcinoma (ADC). Diagnostic accuracy and consistency were evaluated by submitting each image to GPT, ChatGPT o3, and Gemini 2.5 Pro 5 to 10 times. When distinguishing normal from abnormal cytology, LLMs showed mean sensitivity between 85.4% and 100%, and specificity between 67.2% and 92.7%. ChatGPT o3 was more accurate in identifying NILM (mean 89.2% vs. 67.2%) but less accurate in detecting LSIL (34% vs. 85%), HSIL (6% vs. 63%), and ADC (28% vs. 91%). Chain-of-thought prompting and submitting multiple images of the same diagnosis to ChatGPT o3 and Gemini 2.5 Pro did not significantly improve accuracy. Both models also performed poorly in identifying cervicovaginal infections. ChatGPT o3 and Gemini 2.5 Pro demonstrated complementary strengths in cervical cytology. Due to their low accuracy and inconsistency in abnormal cytology, general-purpose LLMs are not recommended as diagnostic support tools in cervical cytology.
