Room: Exhibit Hall | Forum 7
Purpose: Because beam orientation selection is a large combinatorial problem, current beam orientation optimization algorithms for radiotherapy, such as column generation (CG), are typically heuristic or greedy in nature, leading to suboptimal solutions. We propose a reinforcement learning strategy using Monte Carlo tree search that can find a superior beam orientation set in less time than CG.
Methods: We used a reinforcement learning structure in which a supervised learning network guides Monte Carlo tree search to explore the decision space of the beam orientation selection problem. We previously trained a deep neural network (DNN) that takes in the patient anatomy, organ weights, and current beams, and approximates beam fitness values, indicating the next best beam to add. This DNN is used to probabilistically guide the traversal of the branches of the Monte Carlo decision tree to add a new beam to the plan. To test the feasibility of the algorithm, we solved for 5-beam plans on 13 test prostate cancer patients, distinct from the 57 training and validation patients originally used to train the DNN.
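The guided traversal described in Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: it simplifies the search to repeated policy-guided rollouts without per-node tree statistics, and every name in it (`guided_tree_search`, `toy_policy`, `toy_objective`, the 10-degree candidate grid) is a hypothetical stand-in. In particular, `toy_policy` takes the place of the trained DNN and `toy_objective` takes the place of the plan objective obtained from fluence map optimization.

```python
import math
import random


def angular_sep(a, b):
    """Smallest angle in degrees between two gantry angles."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def toy_policy(chosen, remaining):
    """Hypothetical stand-in for the DNN: one positive weight per remaining beam.

    Beams far from those already chosen get larger weights.
    """
    if not chosen:
        return [1.0] * len(remaining)
    return [1.0 + min(angular_sep(b, c) for c in chosen) for b in remaining]


def toy_objective(beams):
    """Hypothetical stand-in for the plan objective (lower is better).

    Penalizes pairs of beams that are close together.
    """
    return sum(
        1.0 / (1.0 + angular_sep(a, b))
        for i, a in enumerate(beams)
        for b in beams[i + 1:]
    )


def guided_tree_search(candidates, n_beams, policy, objective, n_iter=200, seed=0):
    """Repeatedly walk the decision tree from the root, one beam per level.

    At each level the next beam is sampled with probability proportional to
    the policy weights; the best complete beam set seen so far is kept.
    """
    rng = random.Random(seed)
    best_set, best_val = None, math.inf
    for _ in range(n_iter):
        chosen = []
        while len(chosen) < n_beams:
            remaining = [b for b in candidates if b not in chosen]
            weights = policy(chosen, remaining)
            chosen.append(rng.choices(remaining, weights=weights, k=1)[0])
        chosen = sorted(chosen)
        val = objective(chosen)
        if val < best_val:
            best_set, best_val = chosen, val
    return best_set, best_val


if __name__ == "__main__":
    candidates = list(range(0, 360, 10))  # assumed candidate gantry angles
    beams, value = guided_tree_search(candidates, 5, toy_policy, toy_objective)
    print(beams, value)
```

In this sketch the sampling step is where the network's fitness estimates would enter: branches the policy rates highly are visited more often, while lower-rated branches still have a nonzero chance of being explored, which is what distinguishes the approach from a purely greedy selection.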
Results: On average, the CG algorithm needed 700 seconds to find its solution, while the proposed method found higher-quality solutions (lower final objective value) in 100 seconds. Using our guided tree search (GTS) method, we maintained similar planning target volume (PTV) coverage, within 2% error, and reduced the organ-at-risk (OAR) mean dose by 0.10 ± 0.08% (body), 2.44 ± 2.01% (rectum), 4.94 ± 4.65% (L_fem_head), and 6.40 ± 3.94% (R_fem_head) of the prescription dose, at the cost of a slight increase of 1.31 ± 1.96% in bladder mean dose.
Conclusion: In this study, we demonstrate that our GTS method produces plans superior to those of CG in less time than CG requires, making it suitable for clinical application and time-sensitive problems.