Actor Critic with Experience Replay-based automatic treatment planning for prostate cancer intensity modulated radiotherapy

📅 2025-02-01
🤖 AI Summary
Problem: Existing AI methods for intensity-modulated radiation therapy (IMRT) inverse planning in prostate cancer depend on large, high-quality datasets and generalize poorly. Method: This paper proposes a deep reinforcement learning agent based on Actor-Critic with Experience Replay (ACER) that tunes treatment planning parameters during inverse planning, taking dose-volume histograms (DVHs) as input. The agent is trained on a single patient case and validated on two independent cases, so no large training dataset is needed. Robustness is tested against adversarial attacks generated with the Fast Gradient Sign Method (FGSM). Results: On more than 300 test plans across three datasets, 93.09% of plans reach the maximum ProKnow score of 9 after ACER-based planning, with a mean score of 8.93 ± 0.27, up from 6.20 ± 1.84 before planning. The agent generalizes well from its single training case and remains robust against the adversarial perturbations.
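To make the "experience replay" part of ACER concrete, here is a minimal sketch of a replay buffer. It is not the authors' implementation: the class name `ReplayBuffer`, the transition layout, and the toy values are all assumptions for illustration. The stored `behavior_prob` reflects the fact that ACER, being off-policy, needs the probability the behavior policy assigned to each action for importance weighting; the original ACER algorithm replays whole trajectories, which this sketch simplifies to single transitions.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical layout)."""

    def __init__(self, capacity, seed=0):
        # A deque with maxlen drops the oldest transition once full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, behavior_prob):
        # behavior_prob: probability the acting policy gave to `action`,
        # kept so an off-policy update can form importance weights later.
        self.buffer.append((state, action, reward, next_state, behavior_prob))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the buffer size.
        k = min(batch_size, len(self.buffer))
        return self.rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


# Toy usage: five transitions pushed into a buffer that holds three.
buf = ReplayBuffer(capacity=3)
for step in range(5):
    buf.push(state=[step], action=step % 2, reward=float(step),
             next_state=[step + 1], behavior_prob=0.5)
batch = buf.sample(2)
```

Reusing stored transitions in this way lets each planning interaction contribute to several updates, which is one plausible reason a replay-based agent can be trained on very few cases.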

📝 Abstract
Background: Real-time treatment planning in IMRT is challenging due to complex beam interactions. AI has improved automation, but existing models require large, high-quality datasets and lack universal applicability. Deep reinforcement learning (DRL) offers a promising alternative by mimicking human trial-and-error planning. Purpose: Develop a stochastic policy-based DRL agent for automatic treatment planning with efficient training, broad applicability, and robustness against adversarial attacks using Fast Gradient Sign Method (FGSM). Methods: Using the Actor-Critic with Experience Replay (ACER) architecture, the agent tunes treatment planning parameters (TPPs) in inverse planning. Training is based on prostate cancer IMRT cases, using dose-volume histograms (DVHs) as input. The model is trained on a single patient case, validated on two independent cases, and tested on 300+ plans across three datasets. Plan quality is assessed using ProKnow scores, and robustness is tested against adversarial attacks. Results: Despite training on a single case, the model generalizes well. Before ACER-based planning, the mean plan score was 6.20 $\pm$ 1.84; after, 93.09% of cases achieved a perfect score of 9, with a mean of 8.93 $\pm$ 0.27. The agent effectively prioritizes optimal TPP tuning and remains robust against adversarial attacks. Conclusions: The ACER-based DRL agent enables efficient, high-quality treatment planning in prostate cancer IMRT, demonstrating strong generalizability and robustness.
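The FGSM attack named in the abstract perturbs an input by a small step `eps` in the direction of the sign of the loss gradient: `x_adv = x + eps * sign(dL/dx)`. The sketch below illustrates that rule on a toy problem only. The linear softmax policy, the 20-bin "DVH" vector, the loss (negative log-probability of the originally chosen action), and every name in the code are assumptions for illustration, not the network or attack setup used in the paper.

```python
import numpy as np


def softmax(z):
    # Subtract the max for numerical stability.
    e = np.exp(z - z.max())
    return e / e.sum()


def nll(x, W, b, action):
    """Negative log-probability of `action` under a linear softmax policy."""
    return -np.log(softmax(W @ x + b)[action])


def fgsm_perturb(x, W, b, action, eps):
    """FGSM step: x + eps * sign(d nll / d x).

    For a linear softmax policy the input gradient has the closed form
    W^T (probs - onehot(action)), so no autodiff library is needed.
    """
    probs = softmax(W @ x + b)
    onehot = np.zeros_like(probs)
    onehot[action] = 1.0
    grad_x = W.T @ (probs - onehot)
    return x + eps * np.sign(grad_x)


# Toy setup (all values hypothetical): a 20-bin DVH-like input and
# a policy over 5 parameter-adjustment actions.
rng = np.random.default_rng(0)
n_bins, n_actions = 20, 5
W = rng.normal(size=(n_actions, n_bins))
b = np.zeros(n_actions)
dvh = np.linspace(1.0, 0.0, n_bins)

action = int(np.argmax(W @ dvh + b))            # action chosen on clean input
dvh_adv = fgsm_perturb(dvh, W, b, action, eps=0.01)

loss_clean = nll(dvh, W, b, action)
loss_adv = nll(dvh_adv, W, b, action)           # not lower than loss_clean
```

A robustness check in this spirit would compare the agent's chosen action, or the resulting plan score, on `dvh` and on `dvh_adv` as `eps` grows.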
Problem

Research questions and friction points this paper is trying to address.

Deep Reinforcement Learning
Prostate Cancer IMRT Planning
Adaptability and Robustness in Treatment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deep Reinforcement Learning
Actor-Critic with Experience Replay (ACER)
Intensity-Modulated Radiation Therapy (IMRT) Planning
Md Mainul Abrar
Department of Physics, The University of Texas at Arlington, Arlington, TX.
Parvat Sapkota
Department of Physics, The University of Texas at Arlington, Arlington, TX.
Damon Sprouts
Department of Physics, The University of Texas at Arlington, Arlington, TX.
Xun Jia
Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University
Radiation therapy physics; Deep learning in Radiation therapy; Cone-beam CT reconstruction; Monte Carlo radiation transport simu
Yujie Chi
Department of Physics, The University of Texas at Arlington, Arlington, TX.