Automatic Treatment Planning using Reinforcement Learning for High-dose-rate Prostate Brachytherapy

📅 2025-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
In HDR prostate brachytherapy, needle placement planning heavily relies on physician expertise, resulting in low efficiency and inconsistent plan quality. Method: This study introduces deep reinforcement learning (specifically the Proximal Policy Optimization algorithm) for fully automated treatment planning—extracting anatomical features from preoperative imaging, employing a multi-round, needle-by-needle optimization strategy, and incorporating a clinically informed dosimetric reward function. Results: The proposed method achieves equivalent performance to manual plans for prostate V100 and rectal D2cc (p > 0.05), while significantly outperforming them for prostate V150 and urethral D20% (p < 0.05). It reduces the average number of needles by two, substantially decreases inter-physician variability, and improves both planning consistency and efficiency.

📝 Abstract
Purpose: In high-dose-rate (HDR) prostate brachytherapy procedures, the pattern of needle placement relies solely on physician experience. We investigated the feasibility of using reinforcement learning (RL) to provide needle positions and dwell times based on patient anatomy during the pre-planning stage. This approach would reduce procedure time and ensure consistent plan quality. Materials and Methods: We trained an RL agent to adjust the position of one selected needle and all the dwell times on it to maximize a pre-defined reward function after observing the environment. After each adjustment, the RL agent moves on to the next needle until all needles have been adjusted. The agent plays multiple rounds until the maximum number of rounds is reached. Plan data from 11 prostate HDR boost patients (1 for training and 10 for testing) treated in our clinic were included in this study. The dosimetric metrics and the number of needles used in the RL plans were compared to those of the clinical results (ground truth). Results: On average, RL plans and clinical plans have very similar prostate coverage (Prostate V100) and Rectum D2cc (no statistically significant difference), while RL plans have a smaller prostate hotspot (Prostate V150) and lower Urethra D20%, with statistical significance. Moreover, RL plans use two fewer needles than clinical plans on average. Conclusion: We present the first study demonstrating the feasibility of using reinforcement learning to autonomously generate clinically practical HDR prostate brachytherapy plans. This RL-based method achieved equal or improved plan quality compared to conventional clinical approaches while requiring fewer needles. With minimal data requirements and strong generalizability, this approach has substantial potential to standardize brachytherapy planning, reduce clinical variability, and enhance patient outcomes.
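The multi-round, needle-by-needle loop described above can be sketched in a few lines. This is a minimal illustrative toy, not the authors' implementation: the reward function, the adjustment rule, and all constants (`NUM_NEEDLES`, `NUM_ROUNDS`, `propose_adjustment`) are hypothetical stand-ins, and the greedy accept step replaces the trained PPO policy that actually selects the adjustments.

```python
# Toy sketch of multi-round, needle-by-needle plan optimization.
# All names and the reward are illustrative assumptions; the paper's
# method uses a PPO agent and a clinically informed dosimetric reward.
import random

random.seed(0)

NUM_NEEDLES = 16  # hypothetical plan size
NUM_ROUNDS = 3    # maximum number of rounds over all needles

def reward(positions, dwell_times):
    """Stand-in for the dosimetric reward: favor positions near a
    target depth (coverage) and penalize long dwell times (hotspots)."""
    coverage = -sum((p - 0.5) ** 2 for p in positions)
    hotspot_penalty = -sum(max(0.0, t - 1.5) for t in dwell_times)
    return coverage + hotspot_penalty

def propose_adjustment(pos, dwell):
    """Stand-in for the policy's action: a small random perturbation
    of one needle's position and its dwell time."""
    return (pos + random.uniform(-0.1, 0.1),
            max(0.0, dwell + random.uniform(-0.2, 0.2)))

positions = [random.random() for _ in range(NUM_NEEDLES)]
dwells = [1.0] * NUM_NEEDLES

initial = reward(positions, dwells)
best = initial
for _ in range(NUM_ROUNDS):            # multiple rounds, as in the paper
    for i in range(NUM_NEEDLES):       # adjust one needle at a time
        new_pos, new_dwell = propose_adjustment(positions[i], dwells[i])
        trial_pos = positions[:i] + [new_pos] + positions[i + 1:]
        trial_dwell = dwells[:i] + [new_dwell] + dwells[i + 1:]
        r = reward(trial_pos, trial_dwell)
        if r > best:                   # greedy stand-in for the RL policy
            positions, dwells, best = trial_pos, trial_dwell, r

print(round(best, 3))
```

The key structural point the sketch preserves is that each action touches a single needle (its position plus its dwell times) while the reward is evaluated on the whole plan, and the agent sweeps all needles repeatedly up to a fixed round limit.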
Problem

Research questions and friction points this paper is trying to address.

Automating needle placement in prostate brachytherapy using RL
Reducing procedure time and ensuring consistent plan quality
Improving dosimetric metrics with fewer needles than clinical plans
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning optimizes needle positions
RL agent adjusts dwell times autonomously
Fewer needles with equal or better quality
Tonghe Wang
Memorial Sloan Kettering Cancer Center
Medical Physics · CT Imaging · Radiation Therapy
Yining Feng
Department of Medical Physics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065
Xiaofeng Yang
Department of Radiation Oncology and Winship Cancer Institute, Emory University, Atlanta, GA 30322; Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University School of Medicine, Atlanta, GA, 30332; Department of Biomedical Informatics, Emory University, Atlanta, GA 30322