Room: AAPM ePoster Library
Purpose: To introduce a self-learning deep neural network that predicts a set of beam orientations capable of outperforming a current state-of-the-art optimization method, column generation (CG), for intensity-modulated radiation therapy (IMRT). The goal is to design a method that produces high-quality plans and whose application can be extended to complex problems in radiation therapy, such as 4π radiation therapy and proton therapy.
Methods: A deep reinforcement learning-based neural network is designed so that it can self-improve over time. The proposed method starts with a deep neural network (DNN) previously trained to mimic CG performance by iteratively selecting one beam at a time. Starting from this pre-trained model, a tree-based structure is embedded after the selection of each set of beam orientations so that the model can learn and improve its own beam selection policy. Each tree uses the current model to traverse the decision space, and the improved results are later used to update the current model.
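As an illustration only, and not the authors' implementation, the self-improvement loop described in Methods can be sketched in Python under several loudly stated assumptions: the pre-trained DNN is replaced by a hypothetical table of per-beam preference weights (INITIAL_WEIGHTS), the plan objective is replaced by a toy cost that penalizes angularly close beams (toy_objective), and the tree is a simple fixed-width search (tree_search). All names, the 12-candidate beam grid, the 3-beam budget, and the update rule are hypothetical.

```python
import itertools
import math

N_CANDIDATES = 12  # hypothetical: 12 candidate beam angles on a circle
N_BEAMS = 3        # hypothetical: user-defined number of beams to select

# Hypothetical stand-in for the pre-trained DNN: it prefers adjacent beams 0, 1, 2.
INITIAL_WEIGHTS = [1.0, 0.9, 0.8] + [0.0] * (N_CANDIDATES - 3)


def toy_objective(beams):
    """Toy plan cost (lower is better); NOT a real dose-based objective.
    Each pair of beams contributes 1 / (circular separation in index units)."""
    cost = 0.0
    for a, b in itertools.combinations(sorted(beams), 2):
        sep = min(b - a, N_CANDIDATES - (b - a))
        cost += 1.0 / sep
    return cost


def policy_probs(weights, selected):
    """Softmax over the beams not yet selected (stand-in for the DNN policy)."""
    remaining = [i for i in range(N_CANDIDATES) if i not in selected]
    exps = [math.exp(weights[i]) for i in remaining]
    total = sum(exps)
    return {i: e / total for i, e in zip(remaining, exps)}


def greedy_select(weights):
    """Select one beam at a time, always taking the policy's top choice."""
    selected = []
    while len(selected) < N_BEAMS:
        probs = policy_probs(weights, selected)
        selected.append(max(probs, key=probs.get))
    return selected


def tree_search(weights, selected, width=3):
    """Expand the `width` most probable beams at each level of the tree and
    return the lowest-cost complete beam set together with its cost."""
    if len(selected) == N_BEAMS:
        return list(selected), toy_objective(selected)
    probs = policy_probs(weights, selected)
    top = sorted(probs, key=probs.get, reverse=True)[:width]
    best_set, best_cost = None, float("inf")
    for beam in top:
        leaf, cost = tree_search(weights, selected + [beam], width)
        if cost < best_cost:
            best_set, best_cost = leaf, cost
    return best_set, best_cost


def self_improve(weights, rounds=20, lr=0.5):
    """Repeatedly run the tree search with the current policy, then nudge the
    policy weights toward the beams in the best set the tree found."""
    weights = list(weights)
    for _ in range(rounds):
        best_set, _ = tree_search(weights, [])
        for beam in best_set:
            weights[beam] += lr
    return weights


if __name__ == "__main__":
    before = greedy_select(INITIAL_WEIGHTS)
    after = greedy_select(self_improve(INITIAL_WEIGHTS))
    print("before:", before, toy_objective(before))
    print("after: ", after, toy_objective(after))
```

In this toy setting the updated policy selects a lower-cost beam set than the initial one, because the tree looks a few steps beyond the policy's top choice. The narrow search only explores beams the current policy already ranks highly, so it is not guaranteed to reach the best possible set.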
Results: After the reinforcement learning (RL) model is trained, it can predict a user-defined number of beam orientations in less than 2 seconds. Although this algorithm is still in its training stages, our preliminary results show improvements of more than 54% and 74% in the objective function of the plan generated from the beams chosen by RL, compared to CG and DNN, respectively.
Conclusion: We propose a strong and fast reinforcement learning method to help select beam orientations for prostate cancer patients. Once trained, the model can find a set of beam orientations in at most 2 seconds.
Funding Support, Disclosures, and Conflict of Interest: National Institutes of Health (NIH) R01CA237269