Investigator

Nguyen Quoc Khanh Le

Associate Professor · Taipei Medical University


Papers

Interpretable Machine Learning for Proteomics‐Based Subtyping and Tumor Mutational Burden Prediction in Endometrial Cancer

Abstract

Background: Endometrial carcinoma (EC) represents a significant clinical challenge due to its pronounced molecular heterogeneity, which directly influences prognosis and therapeutic response. Accurate classification of molecular subtypes (CNV‐high, CNV‐low, MSI‐H, POLE) and precise tumor mutational burden (TMB) assessment are crucial for guiding personalized therapeutic interventions. Integrating proteomics data with advanced machine learning (ML) techniques offers a promising strategy for achieving precise, clinically actionable classification and biomarker discovery in EC.

Materials and Methods: Using proteomic data from 95 EC patients (83 endometrioid, 12 serous), sourced from the Clinical Proteomic Tumor Analysis Consortium (CPTAC), we developed an ML pipeline integrating proteomic feature selection (Lasso‐penalized logistic regression), classification modeling, and interpretability analysis. The dataset was divided into training (70%) and test (30%) sets, with the synthetic minority oversampling technique (SMOTE) applied to address class imbalance. Logistic regression models were trained for molecular subtype classification and TMB prediction, and model performance was evaluated using accuracy, AUC, precision, recall, and F1‐score. Model interpretability was enhanced using explainable AI (XAI) techniques: SHapley Additive exPlanations (SHAP) and Local Interpretable Model‐agnostic Explanations (LIME).

Results: Feature selection reduced the proteomic dataset from 11,000 to eight key proteins. The proteomics‐based ML model demonstrated robust predictive performance, accurately classifying EC molecular subtypes (accuracy: 82.8%; AUC: 0.990) and distinguishing high (≥10 mutations/Mb) from low (<10 mutations/Mb) TMB cases (accuracy: 89.7%; AUC: 0.984). SHAP analysis highlighted clinically recognized biomarkers (MLH1, PMS2, STAT1) and identified novel protein candidates (MTHFD2, MAST4, RPL22L1, MX2, SEC16A). LIME analysis provided individualized prediction interpretations, clarifying each protein biomarker's influence on model decisions.

Conclusion: Our proteomics‐driven ML approach demonstrates high accuracy and interpretability in EC subtype classification and TMB prediction. By identifying validated and novel biomarkers, this strategy provides essential biological insights and a strong foundation for the future development of non‐invasive diagnostics, personalized treatments, and precision medicine in EC.
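The abstract above labels tumors as high or low TMB at a cutoff of 10 mutations/Mb and uses SMOTE to offset class imbalance. The sketch below illustrates those two steps in plain Python. It is not the authors' implementation: the function names (`tmb_label`, `smote`, `balance`), the toy data, and the choice to oversample only the training split are assumptions made for illustration. A real pipeline would more likely rely on an established library.

```python
import math
import random

TMB_HIGH_CUTOFF = 10.0  # mutations/Mb, the cutoff quoted in the abstract


def tmb_label(mutations_per_mb):
    """Return 1 for high TMB (>= 10 mutations/Mb), otherwise 0."""
    return 1 if mutations_per_mb >= TMB_HIGH_CUTOFF else 0


def smote(minority, n_new, k=5, rng=None):
    """Create n_new synthetic rows by interpolating between a minority-class
    row and one of its k nearest minority-class neighbours (hypothetical helper)."""
    if rng is None:
        rng = random.Random(0)
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, k=5, seed=0):
    """Oversample every class up to the size of the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < target:
            extra = smote(rows, target - len(rows), k=k, rng=rng)
            X_out.extend(extra)
            y_out.extend([label] * len(extra))
    return X_out, y_out


# Toy training split (made-up values): two protein abundances per tumor.
X_train = [[0.1, 1.0], [0.2, 0.9], [0.0, 1.1], [0.3, 1.2], [2.0, 3.0], [2.2, 3.1]]
tmb_values = [2.5, 4.0, 1.2, 9.9, 10.0, 35.0]
y_train = [tmb_label(v) for v in tmb_values]

# Assumption: only the training split is oversampled; the test split is left untouched.
X_balanced, y_balanced = balance(X_train, y_train, k=1)
```

Oversampling after the train/test split, and only on the training portion, keeps synthetic points from leaking information about held-out tumors into the evaluation. Whether the study did exactly this is not stated in the abstract.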

164 Works
2 Papers

Positions

2023–

Associate Professor

Taipei Medical University

2019–

Assistant Professor

Taipei Medical University · Professional Master Program in Artificial Intelligence in Medicine

2018–

Research Fellow

Nanyang Technological University · School of Humanities

2012–

Research Scholar

Yuan Ze University · Department of Computer Science and Engineering

2010–

Software Engineer

VNG Corporation

Education

2018

Ph.D.

Yuan Ze University · Computer Science and Engineering

2014

Master

Yuan Ze University · Computer Science and Engineering

2010

Bachelor

University of DaLat · Information Technology

Country

TW

Keywords
radiomics, bioinformatics, artificial intelligence, medical imaging, data analysis
Links & IDs
ORCID: 0000-0003-4896-7926

Scopus: 57208281644

ResearcherID: H-2057-2017