Investigator

Olatomiwa O. Bifarin

Georgia Institute Of Technology

Papers (3)
Automated Machine Lea…; Serum Lipidome Profil…; Machine Learning Reve…
Collaborators (10)
Facundo M. Fernández, Samyukta Sah, Sun Young Kwon, Chi-Heum Cho, David A. Gaul, Hanbyoul Cho, Hyewon Chung, Jae Hoon Kim, Jaeyeon Kim, Samuel G. Moore
Institutions (4)
Georgia Institute Of …; Keimyung University S…; Yonsei University Col…; Indiana University Me…

Papers

Automated Machine Learning and Explainable AI (AutoML-XAI) for Metabolomics: Improving Cancer Diagnostics

Metabolomics generates complex data necessitating advanced computational methods for generating biological insight. While machine learning (ML) is promising, the challenges of selecting the best algorithms and tuning hyperparameters, particularly for nonexperts, remain. Automated machine learning (AutoML) can streamline this process; however, the issue of interpretability could persist. This research introduces a unified pipeline that combines AutoML with explainable AI (XAI) techniques to optimize metabolomics analysis. We tested our approach on two data sets: renal cell carcinoma (RCC) urine metabolomics and ovarian cancer (OC) serum metabolomics. AutoML, using Auto-sklearn, surpassed standalone ML algorithms like SVM and k-Nearest Neighbors in differentiating between RCC and healthy controls, as well as OC patients and those with other gynecological cancers. The effectiveness of Auto-sklearn is highlighted by its AUC scores of 0.97 for RCC and 0.85 for OC, obtained from the unseen test sets. Importantly, on most of the metrics considered, Auto-sklearn demonstrated a better classification performance, leveraging a mix of algorithms and ensemble techniques. Shapley Additive Explanations (SHAP) provided a global ranking of feature importance, identifying dibutylamine and ganglioside GM3(d34:1) as the top discriminative metabolites for RCC and OC, respectively. Waterfall plots offered local explanations by illustrating the influence of each metabolite on individual predictions. Dependence plots spotlighted metabolite interactions, such as the connection between hippuric acid and one of its derivatives in RCC, and between GM3(d34:1) and GM3(18:1_16:0) in OC, hinting at potential mechanistic relationships. Through decision plots, a detailed error analysis was conducted, contrasting feature importance for correctly versus incorrectly classified samples.
In essence, our pipeline emphasizes the importance of harmonizing AutoML and XAI, facilitating both simplified ML application and improved interpretability in metabolomics data science.
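As a rough illustration of the additive attribution idea behind SHAP mentioned in the abstract above, the sketch below computes exact Shapley values for a single sample by averaging each feature's marginal contribution over all feature orderings. It is not the authors' pipeline and does not use the SHAP library or Auto-sklearn; `toy_model`, `sample`, and `baseline` are hypothetical stand-ins, and practical tools typically approximate this computation or exploit model structure, because enumerating orderings is only feasible for a handful of features.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features are switched from their baseline value to their observed value
    one at a time, in every possible order; each feature's attribution is its
    average marginal change in the prediction.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in phi]


def toy_model(features):
    """Hypothetical score with one interaction term (not a real classifier)."""
    a, b, c = features
    return 2.0 * a + 1.0 * b + 0.5 * a * c


sample = [1.0, 2.0, 4.0]    # hypothetical feature values for one sample
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point

phi = shapley_values(toy_model, sample, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the sample and the prediction at the baseline, which is the property that waterfall-style local explanations rely on. The interaction term is split evenly between the two features involved.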

Serum Lipidome Profiling Reveals a Distinct Signature of Ovarian Cancer in Korean Women

Background: Distinguishing ovarian cancer from other gynecological malignancies is crucial for patient survival yet hindered by non-specific symptoms and limited understanding of ovarian cancer pathogenesis. Accumulating evidence suggests a link between ovarian cancer and deregulated lipid metabolism. Most studies have small sample sizes, especially for early-stage cases, and lack racial/ethnic diversity, necessitating more inclusive research for improved ovarian cancer diagnosis and prevention. Methods: Here, we profiled the serum lipidome of 208 patients with ovarian cancer, including 93 early-stage patients, and 117 patients with nonovarian cancer (other gynecological malignancies), all of Korean descent. Serum samples were analyzed with a high-coverage liquid chromatography high-resolution mass spectrometry platform, and lipidome alterations were investigated via statistical and machine learning (ML) approaches. Results: We found that lipidome alterations unique to ovarian cancer were present in Korean women as early as when the cancer is localized, and those changes increase in magnitude as the disease progresses. Analysis of relative lipid abundances revealed specific patterns for various lipid classes, with most classes showing decreased abundance in ovarian cancer in comparison with other gynecological diseases. ML methods selected a panel of 17 lipids that discriminated ovarian cancer from nonovarian cancer cases with an AUC value of 0.85 for an independent test set. Conclusions: This study provides a systemic analysis of lipidome alterations in human ovarian cancer, specifically in Korean women. Impact: Here, we show the potential of circulating lipids in distinguishing ovarian cancer from nonovarian cancer conditions.
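For readers unfamiliar with the AUC metric reported above, the following minimal sketch computes it as the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as half. The labels and scores are hypothetical and are not data from the study, and `roc_auc` is an illustrative helper rather than the authors' evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g., case), 0 for the negative class.
    scores: classifier outputs, where higher means "more likely positive".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out predictions: three cases, three controls.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise form shown here is quadratic in the number of samples, which is fine for a sketch but slower than rank-based implementations on large test sets.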
