Dose-Volume Histogram Based Inverse Treatment Planning Via Deep Reinforcement Learning in Intensity Modulated Radiation Therapy (IMRT) for Prostate Cancer

D Sprouts1*, Y Chi1, C Shen2, X Jia2, (1) University of Texas at Arlington, Arlington, TX, (2) University of Texas Southwestern Medical Center, Dallas, TX


(Sunday, 7/12/2020)   [Eastern Time (GMT-4)]

Room: AAPM ePoster Library

Purpose: Dose-volume histogram (DVH)-based optimization engines are widely used in modern treatment planning systems (TPSs). Operating a TPS to produce a clinically acceptable plan requires extensive manual effort from human planners. To reduce this repetitive human effort, we developed a deep neural network (DNN)-based virtual treatment planner (VTP), trained via end-to-end deep reinforcement learning (DRL), that automatically operates a DVH-based optimization engine to produce high-quality treatment plans.

Methods: We used intensity modulated radiotherapy (IMRT) treatment planning for prostate cancer as a testbed for the proposed framework. Like a human planner, the VTP repeatedly observes the DVHs of intermediate treatment plans and operates an in-house DVH-based TPS by adjusting treatment planning parameters (TPPs), such as the volume and dose constraints and their corresponding weights, to improve plan quality. We trained the VTP via end-to-end DRL with an experience replay mechanism under the Q-learning framework. An epsilon-greedy algorithm was used to explore the effects of different actions over a large number of automatically generated plans, from which an optimal policy for improving plan quality can be learned.
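The training loop described in Methods (Q-learning, epsilon-greedy exploration, experience replay) can be illustrated with a minimal sketch. This is not the authors' implementation: a small Q-table stands in for the DNN, and a toy environment stands in for the in-house TPS. The state is a hypothetical integer plan score from 0 to 10, the two actions lower or raise a single made-up parameter, and the reward is the change in score. All names and hyperparameter values here are assumptions chosen for illustration.

```python
import random
from collections import deque

N_STATES = 11   # hypothetical plan scores 0..10 (toy stand-in for plan quality)
N_ACTIONS = 2   # 0: lower a made-up parameter, 1: raise it


def step(state, action):
    """Toy stand-in for one TPS optimization run: returns (next_state, reward)."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    return next_state, next_state - state  # reward = change in plan score


def greedy_action(q, state):
    """Action with the highest current Q-value in this state."""
    return max(range(N_ACTIONS), key=lambda a: q[state][a])


def epsilon_greedy(q, state, epsilon, rng):
    """Explore with probability epsilon, otherwise exploit the Q-table."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    return greedy_action(q, state)


def train(episodes=300, steps=20, alpha=0.1, gamma=0.9,
          epsilon=0.2, batch_size=16, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    replay = deque(maxlen=1000)  # experience replay buffer
    for _ in range(episodes):
        state = rng.randrange(N_STATES)  # random initial plan, as in testing
        for _ in range(steps):
            action = epsilon_greedy(q, state, epsilon, rng)
            next_state, reward = step(state, action)
            replay.append((state, action, reward, next_state))
            # Q-learning update on a random minibatch of stored transitions.
            k = min(batch_size, len(replay))
            for s, a, r, s2 in rng.sample(list(replay), k):
                target = r + gamma * max(q[s2])
                q[s][a] += alpha * (target - q[s][a])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    print([greedy_action(q_table, s) for s in range(N_STATES)])
```

In the actual work, the state would be the observed DVHs and the Q-function a DNN, so each minibatch update would be a gradient step rather than a table write; the overall structure of explore, store, and replay is the part this sketch is meant to convey.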

Results: The VTP was successfully trained on 10 patient cases and then tested on another 12. For every testing case, the initial TPPs were assigned randomly. The average initial plan score was 5; after the VTP automatically adjusted the TPPs, the average plan score improved to 9 out of a maximum of 10.

Conclusion: The trained VTP is capable of automatically producing high-quality IMRT plans for prostate cancer by operating a DVH-based plan optimization engine intelligently, in a human-like manner. The proposed approach is generally applicable to other cancer sites and treatment techniques, and it has the potential to be incorporated into commercial TPSs to fully automate the treatment planning process.

