Doubly Robust Fusion of Many Treatments for Policy Learning

📅 2025-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In settings with many treatments, learning individualized treatment rules (ITRs) is hampered by sparse data within each arm and severely imbalanced covariate distributions across arms, which can yield suboptimal or invalid policies. The paper proposes a calibration-weighted treatment fusion (CWTF) method that balances covariates across arms with calibration weights and fuses similar treatments through a penalized working model; the fused arms can then be passed to existing multi-arm ITR learners such as policy trees. The fusion step is doubly robust: it recovers the latent treatment group structure when either the calibration model or the outcome model is correctly specified. The authors establish consistency, the oracle property of treatment fusion, and regret bounds for the resulting policies. Because ITR learning in the fused space may use only a subset of covariates, practical concerns such as fairness can be accommodated. Simulation studies show better group recovery and higher policy value than existing approaches, and an application to de-identified electronic health records from patients with chronic lymphocytic leukemia/small lymphocytic lymphoma (CLL/SLL) illustrates the method's practical utility.

📝 Abstract

Individualized treatment rules/recommendations (ITRs) aim to improve patient outcomes by tailoring treatments to the characteristics of each individual. However, when there are many treatment groups, existing methods face significant challenges due to data sparsity within treatment groups and highly unbalanced covariate distributions across groups. To address these challenges, we propose a novel calibration-weighted treatment fusion procedure that robustly balances covariates across treatment groups and fuses similar treatments using a penalized working model. The fusion procedure ensures the recovery of latent treatment group structures when either the calibration model or the outcome model is correctly specified. In the fused treatment space, practitioners can seamlessly apply state-of-the-art ITR learning methods with the flexibility to utilize a subset of covariates, thereby achieving robustness while addressing practical concerns such as fairness. We establish theoretical guarantees, including consistency, the oracle property of treatment fusion, and regret bounds when integrated with multi-armed ITR learning methods such as policy trees. Simulation studies show superior group recovery and policy value compared to existing approaches. We illustrate the practical utility of our method using a nationwide electronic health record-derived de-identified database containing data from patients with Chronic Lymphocytic Leukemia and Small Lymphocytic Lymphoma.
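The "correct if either model is correct" property and the policy value mentioned in the abstract can be illustrated with the standard augmented inverse-probability-weighted (AIPW) score used by multi-arm policy learners such as policy trees. The sketch below is generic and is not taken from the paper: the function names `aipw_scores` and `policy_value` are hypothetical, and the nuisance estimates `mu_hat` and `e_hat` are assumed to be supplied by the user.

```python
import numpy as np

def aipw_scores(y, a, mu_hat, e_hat):
    """Return an (n, K) matrix of doubly robust (AIPW) scores.

    y      : (n,) observed outcomes
    a      : (n,) integer arm labels in {0, ..., K-1}
    mu_hat : (n, K) outcome-model predictions for every arm (assumed given)
    e_hat  : (n, K) estimated probabilities of receiving each arm (assumed given)
    """
    scores = np.array(mu_hat, dtype=float)   # start from the outcome-model prediction
    rows = np.arange(len(y))
    # Add the inverse-probability-weighted residual only for the arm actually received.
    scores[rows, a] += (y - scores[rows, a]) / e_hat[rows, a]
    return scores

def policy_value(scores, policy):
    """Estimated mean outcome if unit i were given arm policy[i]."""
    return scores[np.arange(scores.shape[0]), policy].mean()

# Tiny hand-checkable example with two units and two arms.
y = np.array([1.0, 2.0])
a = np.array([0, 1])
mu_hat = np.array([[0.5, 0.5], [1.0, 1.0]])
e_hat = np.full((2, 2), 0.5)
scores = aipw_scores(y, a, mu_hat, e_hat)
value = policy_value(scores, np.array([0, 1]))
```

Averaging a column of `scores` estimates the mean outcome under that arm; the estimate stays consistent if either `mu_hat` or `e_hat` is correct, which is the sense of "doubly robust" here. Regret is then the gap between the best achievable policy value in the policy class and the value of the learned policy.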

Problem

Research questions and friction points this paper is trying to address.

Address data sparsity in many treatment groups

Balance covariate distributions across treatment groups

Fuse similar treatments for robust ITR learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Calibration-weighted treatment fusion for covariate balance

Penalized working model to fuse similar treatments

Flexible covariate subset use in ITR learning
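As a rough illustration of the covariate-balancing idea in the first item, the sketch below computes exponential-tilting (entropy-balancing-style) weights that make each arm's weighted covariate mean match the full-sample mean. This is one common form of calibration weighting and is an assumption here; the summary does not specify the paper's exact calibration model, and the function name `calibration_weights` and the simulated data are hypothetical.

```python
import numpy as np

def calibration_weights(x_group, target_mean, n_iter=100, tol=1e-10):
    """Weights proportional to exp(lam @ z_i), z_i = x_i - target_mean,
    chosen so that the weighted covariate mean equals target_mean."""
    z = x_group - target_mean
    lam = np.zeros(z.shape[1])

    def objective(l):
        s = z @ l
        m = s.max()
        return m + np.log(np.exp(s - m).sum())    # stable log-sum-exp (convex dual)

    for _ in range(n_iter):
        s = z @ lam
        w = np.exp(s - s.max())
        w /= w.sum()
        grad = w @ z                               # remaining weighted imbalance
        if np.linalg.norm(grad) < tol:
            break
        zc = z - grad
        hess = (zc * w[:, None]).T @ zc            # weighted covariance of z
        step = np.linalg.solve(hess + 1e-10 * np.eye(len(lam)), grad)
        t, f0 = 1.0, objective(lam)
        # Backtracking line search keeps the Newton step from overshooting.
        while objective(lam - t * step) > f0 - 1e-4 * t * (grad @ step) and t > 1e-8:
            t *= 0.5
        lam = lam - t * step
    s = z @ lam
    w = np.exp(s - s.max())
    return w / w.sum()

# Simulated example: three arms whose assignment depends on the first covariate.
rng = np.random.default_rng(1)
n = 600
x = rng.normal(size=(n, 2))
logits = np.column_stack([np.zeros(n), x[:, 0], -x[:, 0]])
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
u = rng.random(n)
arm = np.minimum((u[:, None] > probs.cumsum(axis=1)).sum(axis=1), 2)
target = x.mean(axis=0)
weights = {k: calibration_weights(x[arm == k], target) for k in range(3)}
```

After weighting, every arm resembles the pooled population in its covariate means, so weighted comparisons between arms are less confounded by who received which treatment. A fusion step in the spirit of the second item would then compare arm-level effects estimated on the weighted data and merge arms whose effects are indistinguishable.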
