Room: Karl Dean Ballroom C
Purpose: To compare the effectiveness of pre-treatment features and delta-features computed at different time points for assessing brain radiosurgery treatment response, and to investigate the performance of different combinations of machine learning feature selection and classification methods.
Methods: Pre-treatment, one-week post-treatment, and two-month post-treatment T1-weighted and T2-weighted FLAIR MR images were acquired for 12 brain tumor patients treated with radiosurgery. Sixty-one radiomic features were extracted from the gross tumor volume (GTV) of each image, and the delta-features from pre-treatment to each of the two post-treatment time points were calculated. With leave-one-out cross-validation, the pre-treatment features and the two sets of delta-features were separately input into a univariate Cox regression model and a machine learning model (L1-regularized logistic regression [L1-LR], out-of-bag permutation random forest [RF], or neural network [NN]) for feature selection. Overall survival was then predicted from the selected features with a machine learning classifier (L1-LR, L2-LR, RF, NN, kernel SVM, linear SVM, or naïve Bayes [NB]). The predictive performance of each model combination and feature type was estimated by the area under the receiver operating characteristic (ROC) curve (AUC).
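As an illustration of the evaluation pipeline described above, the following is a minimal sketch and not the authors' implementation. It assumes a delta-feature is the relative change from the pre-treatment value (the abstract does not state the exact definition), pools the leave-one-out scores into a single AUC, and uses a hypothetical nearest-centroid scorer (`centroid_score`) as a stand-in for the classifiers listed in the Methods. All function names are illustrative. In a full pipeline, feature selection would be refit on the training patients inside each fold.

```python
from typing import Dict, Iterator, List, Sequence, Tuple


def delta_features(pre: Dict[str, float], post: Dict[str, float]) -> Dict[str, float]:
    """Relative change per feature (ASSUMED definition, not from the abstract).

    A zero pre-treatment value is mapped to 0.0 to avoid division by zero.
    """
    return {k: (post[k] - pre[k]) / pre[k] if pre[k] != 0 else 0.0 for k in pre}


def leave_one_out(n: int) -> Iterator[Tuple[List[int], int]]:
    """Yield (training indices, held-out index) for each of n patients."""
    for i in range(n):
        yield [j for j in range(n) if j != i], i


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def centroid_score(train_x, train_y, x) -> float:
    """Hypothetical stand-in classifier: higher score means closer to class 1."""
    def centroid(label: int) -> List[float]:
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return sqdist(x, centroid(0)) - sqdist(x, centroid(1))


def loo_auc(X: Sequence[Sequence[float]], y: Sequence[int]) -> float:
    """Score each held-out patient with a model fit on the rest, then pool."""
    scores = []
    for train_idx, test_idx in leave_one_out(len(X)):
        train_x = [X[j] for j in train_idx]
        train_y = [y[j] for j in train_idx]
        scores.append(centroid_score(train_x, train_y, X[test_idx]))
    return auc(scores, y)
```

For example, `loo_auc(X, y)` would be called once per feature type, with `X` holding either the pre-treatment features or one of the two delta-feature sets, so that the resulting AUC values can be compared across feature types.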
Results: The AUC of one-week delta-features was significantly higher than those of pre-treatment features (p = 1.29e-10) and two-month delta-features (p = 6.57e-06). Two model combinations based on one-week delta-features achieved the highest AUC (0.944 each): L1-LR feature selection with RF classification, and RF feature selection with NB classification.
Conclusion: This work indicates that delta-features may have higher predictive value than pre-treatment features for assessing treatment response. The time point at which the delta-features are computed is a vital factor in building models. Analyzing delta-features with a suitable machine learning approach is potentially a powerful tool for assessing treatment response. A larger dataset is needed to validate the results of this study.