🤖 AI Summary
This study addresses the clinical challenge of learning dynamic treatment regimes (DTRs) from right-censored survival data. Methodologically, it proposes a censoring-aware tree-structured reinforcement learning framework—the first to integrate censoring modeling (via Kaplan–Meier or Cox regression assistance) with tree-based RL, employ augmented inverse probability weighting (AIPW) for unbiased causal effect estimation, and optimize for restricted mean survival time (RMST) to ensure both statistical robustness and clinical interpretability. Evaluated on the SANAD epilepsy dataset, the method achieves a 12.3% improvement in RMST and an 8.7% gain in decision accuracy over ASCL. Comprehensive experiments—including both simulation studies and real-world validation—demonstrate its stability and generalizability across diverse scenarios.
📝 Abstract
Dynamic Treatment Regimes (DTRs) provide a systematic approach for making sequential treatment decisions that adapt to individual patient characteristics, particularly in clinical contexts where survival outcomes are of interest. Censoring-Aware Tree-Based Reinforcement Learning (CA-TRL) is a novel framework to address the complexities associated with censored data when estimating optimal DTRs. We explore ways to learn effective DTRs, from observational data. By enhancing traditional tree-based reinforcement learning methods with augmented inverse probability weighting (AIPW) and censoring-aware modifications, CA-TRL delivers robust and interpretable treatment strategies. We demonstrate its effectiveness through extensive simulations and real-world applications using the SANAD epilepsy dataset, where it outperformed the recently proposed ASCL method in key metrics such as restricted mean survival time (RMST) and decision-making accuracy. This work represents a step forward in advancing personalized and data-driven treatment strategies across diverse healthcare settings.