Purpose: To evaluate the performance of machine learning algorithms for dose-volume histogram (DVH) prediction, using 69 head-and-neck cases.
Methods: We survey commonly used supervised machine learning algorithms, namely linear regression (LR), pruned and unpruned decision trees (DT), support vector machines (SVM), gradient boosted trees (GBT), and random forests (RF), for the prediction of spinal cord DVH point D₂. The SVM used tuned parameters gamma = 0.04 and C = 1; RF was developed using 9000 trees, with the number of predictors sampled for splitting tuned to two; and GBT was developed using a learning rate of 0.001 and 9000 trees. All six models were developed using the same predictors, which parameterized the patient's anatomy and tumor dose distributions. Specifically, volume, surface area, cross-sectional area, and spread (maximum distance in the x, y, and z dimensions) were computed for the tumor and cord. The minimum distance and center-of-mass distance between the tumor and cord were also computed. Additionally, tumor DVH points D₂, D₂₀, D₄₀, D₆₀, D₈₀, and D₉₈ were included in modeling. Fifty randomly selected patients were used to train the models, and the remaining 19 patients were used for testing.
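As an illustration of the geometric predictors described in the Methods, the following is a minimal sketch, not the authors' implementation. It assumes each structure is represented as a list of (x, y, z) voxel-center coordinates; the function names and the toy coordinates are hypothetical and chosen only for the example.

```python
import math
from itertools import product


def spread(points):
    """Maximum extent of a structure along each of the x, y, z axes."""
    return tuple(
        max(p[i] for p in points) - min(p[i] for p in points) for i in range(3)
    )


def center_of_mass(points):
    """Unweighted centroid of the voxel centers (assumes uniform density)."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def min_distance(a, b):
    """Smallest Euclidean distance between any point of a and any point of b.

    Brute force over all pairs; fine for a sketch, slow for full-resolution masks.
    """
    return min(math.dist(p, q) for p, q in product(a, b))


def com_distance(a, b):
    """Euclidean distance between the two centers of mass."""
    return math.dist(center_of_mass(a), center_of_mass(b))


# Hypothetical toy structures (arbitrary units), not patient data.
tumor = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
cord = [(5.0, 0.0, 0.0), (5.0, 4.0, 0.0)]

features = {
    "tumor_spread": spread(tumor),
    "cord_spread": spread(cord),
    "min_distance": min_distance(tumor, cord),
    "com_distance": com_distance(tumor, cord),
}
```

In practice these values would be combined with the volume, surface area, cross-sectional area, and tumor DVH points into one predictor row per patient.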
Results: Mean squared error (MSE) was used to compare the accuracy of the six models. LR produced the highest MSE = 40.1, followed by the unpruned and pruned DT MSE = 39.2 and 34.3, respectively, GBT MSE = 32.5, SVM MSE = 30.5, and RF MSE = 27.3. Furthermore, variable importance from RF identified minimum distance between tumor and spinal cord to be the most important variable for predicting spinal cord D₂.
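For reference, the MSE used to rank the models is the average of the squared differences between observed and predicted values over the test patients. A minimal sketch follows; the function name and the three observed/predicted values are hypothetical and are not taken from the study.

```python
def mean_squared_error(observed, predicted):
    """Mean of squared differences between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical spinal cord D2 values for three test patients (illustrative only).
observed_d2 = [30.0, 42.0, 25.0]
predicted_d2 = [33.0, 40.0, 29.0]

mse = mean_squared_error(observed_d2, predicted_d2)
```

A lower MSE indicates predictions that lie closer to the observed values on average, which is the basis for the ranking reported above.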
Conclusion: Random forest produced the lowest error in predicting spinal cord DVH point D₂, in comparison to other machine learning algorithms, including the commonly used linear regression. Predictions from random forest can be used to evaluate treatment plans.