Deciphering Metabolic Features to Target Neuroblastoma Using Machine Learning

R Wang^1,2,3*, Y Zhang^1,4, P Pachnis^4, H Vu^4, K Wang^1,3, R Deberardinis^4, J Wang^1,3. (1) Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, TX. (2) School of Artificial Intelligence, Xidian University, Xi'an, People's Republic of China. (3) Medical Artificial Intelligence and Automation (MAIA) Lab, University of Texas Southwestern Medical Center, Dallas, TX. (4) Children's Research Institute, University of Texas Southwestern Medical Center, Dallas, TX.

R Wang

Presentations

(Wednesday, 7/15/2020) 3:30 PM - 4:30 PM [Eastern Time (GMT-4)]

Room: Track 2

Purpose: Neuroblastoma is the most common extracranial solid tumor in children, with highly variable clinical behavior and outcome. Amplification of the MYCN oncogene is associated with younger age at diagnosis, an aggressive disease course, and worse survival. This study aims to: 1) develop a machine-learning-based pipeline that comprehensively analyzes metabolites to accurately predict MYCN amplification status for risk stratification; and 2) decipher the key metabolites for MYCN status prediction by developing a multi-classifier grouped ranking approach.

Methods: The metabolomic dataset used in this study included 33 neuroblastoma cell lines. Targeted metabolomics identified 161 metabolites based on mass-to-charge ratio and retention time. The area under each mass spectrometry peak, normalized by protein quantification, was used as a surrogate for the relative abundance of each metabolite. Among these cell lines, 22 were MYCN-amplified and 11 were non-amplified. The proposed pipeline for MYCN status prediction consists of four key steps: SMOTE-based augmentation, feature selection by recursive feature elimination, training, and testing. To decipher the key metabolites for MYCN status prediction, a multi-classifier grouped ranking (MCGR) approach was proposed, which consists of three phases: 1) joint feature selection; 2) performance quantification using multiple classifiers, including support vector machines (SVM), kernel extreme learning machine (kELM), and deep perceptron network (DNN); and 3) grouped ranking.
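
The augmentation step can be illustrated with a minimal sketch of SMOTE-style oversampling, which creates synthetic minority-class samples by interpolating between a real sample and one of its nearest minority-class neighbors. The abstract does not give the SMOTE settings or the implementation used, so the function name `smote`, the neighbor count `k`, the five-feature toy data, and the choice to oversample the 11 non-amplified lines up to 22 are all assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic samples (hypothetical helper, not the authors' code).

    Each synthetic sample lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority-class neighbors.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: math.dist(base, m))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy stand-in for normalized metabolite abundances (assumed values):
# 11 non-amplified cell lines, 5 features instead of the 161 metabolites.
toy_rng = random.Random(42)
non_amplified = [[toy_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(11)]

# Add 11 synthetic samples so the minority class matches the 22 amplified lines.
synthetic = smote(non_amplified, n_new=11)
balanced_minority = non_amplified + synthetic
```

In a cross-validated setting, oversampling of this kind is usually applied to the training folds only, so that synthetic samples derived from a test cell line do not leak into training; whether the study did so is not stated in the abstract.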

Results: Among the three classifiers, the best prediction results under three-fold cross-validation reached an AUC of 0.89. The proposed MCGR approach identified twelve key metabolites that were most predictive of MYCN amplification status in neuroblastoma cell lines.
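
As a rough illustration of how such a figure is evaluated, the sketch below splits 33 labels (22 amplified, 11 non-amplified) into three stratified folds and computes AUC as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The helper names `stratified_folds` and `roc_auc` and the example scores are hypothetical; the abstract does not describe how the folds were formed or how AUC was computed.

```python
import random


def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, n_folds=3, seed=0):
    """Assign sample indices to folds so each fold keeps roughly the class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for position, i in enumerate(idx):
            folds[position % n_folds].append(i)
    return folds


# Labels mirroring the class balance in the abstract: 1 = MYCN-amplified.
labels = [1] * 22 + [0] * 11
folds = stratified_folds(labels)

# Small worked AUC example with made-up scores.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With these class counts, each fold holds 7 or 8 amplified and 3 or 4 non-amplified cell lines, so every test fold contains both classes and a per-fold AUC is well defined.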

Conclusion: The MCGR approach was developed as an unbiased, effective way to analyze metabolomics data, with the potential to accelerate scientific discovery and enable clinical application of metabolomics.

Keywords

Feature Extraction, Feature Selection

Taxonomy

TH- Dataset Analysis/Biomathematics: Machine learning techniques
