🤖 AI Summary
This work addresses the challenges of estimating heterogeneous treatment effects (HTEs) from right-censored survival data, where censoring, unobservable counterfactuals, complex identification assumptions, and the absence of standardized evaluation protocols hinder progress. To bridge this gap, we introduce SurvHTE-Bench, the first comprehensive benchmark for HTE estimation in causal survival analysis. It integrates synthetic, semi-synthetic, and real-world datasets, including a twin study and an HIV clinical trial, and systematically evaluates prominent methods (e.g., Causal Survival Forests, survival meta-learners) under diverse causal assumptions and realistic assumption violations. Our framework provides a reproducible, modular evaluation environment with known ground-truth treatment effects where available, revealing the robustness and limitations of existing approaches when assumptions are violated and establishing a standardized foundation for future research in causal survival analysis.
📝 Abstract
Estimating heterogeneous treatment effects (HTEs) from right-censored survival data is critical in high-stakes applications such as precision medicine and individualized policy-making. Yet, the survival analysis setting poses unique challenges for HTE estimation due to censoring, unobserved counterfactuals, and complex identification assumptions. Despite recent advances, from Causal Survival Forests to survival meta-learners and outcome imputation approaches, evaluation practices remain fragmented and inconsistent. We introduce SurvHTE-Bench, the first comprehensive benchmark for HTE estimation with censored outcomes. The benchmark spans (i) a modular suite of synthetic datasets with known ground truth, systematically varying causal assumptions and survival dynamics, (ii) semi-synthetic datasets that pair real-world covariates with simulated treatments and outcomes, and (iii) real-world datasets from a twin study (with known ground truth) and from an HIV clinical trial. Across synthetic, semi-synthetic, and real-world settings, we provide the first rigorous comparison of survival HTE methods under diverse conditions and realistic assumption violations. SurvHTE-Bench establishes a foundation for fair, reproducible, and extensible evaluation of causal survival methods. The data and code of our benchmark are available at: https://github.com/Shahriarnz14/SurvHTE-Bench .
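To make "synthetic datasets with known ground truth" concrete, the following is a minimal sketch of how a right-censored dataset with a known heterogeneous treatment effect could be simulated. It is not the SurvHTE-Bench generator: the function name, the exponential event-time model, the hazard coefficients, the censoring rate, and the choice of restricted mean survival time (RMST) as the estimand are all assumptions made for illustration.

```python
import numpy as np


def simulate_censored_hte(n=1000, horizon=5.0, seed=0):
    """Hypothetical generator: exponential event times, randomized treatment,
    independent censoring, and a closed-form ground-truth effect on RMST."""
    rng = np.random.default_rng(seed)

    x = rng.uniform(0.0, 1.0, size=n)   # single covariate (assumed)
    w = rng.binomial(1, 0.5, size=n)    # randomized treatment, no confounding

    # Assumed hazards: treatment lowers the hazard more for larger x,
    # so the treatment effect is heterogeneous in x.
    rate0 = 0.5 * np.exp(0.5 * x)       # hazard under control
    rate1 = rate0 * np.exp(-x)          # hazard under treatment

    # Draw both potential event times; only one is ever observable.
    t0 = rng.exponential(1.0 / rate0)
    t1 = rng.exponential(1.0 / rate1)
    t = np.where(w == 1, t1, t0)

    # Independent right censoring with an assumed rate of 0.2.
    c = rng.exponential(1.0 / 0.2, size=n)
    time = np.minimum(t, c)             # observed follow-up time
    event = (t <= c).astype(int)        # 1 = event observed, 0 = censored

    # For an exponential time with rate r, E[min(T, h)] = (1 - exp(-r h)) / r.
    def rmst(rate):
        return (1.0 - np.exp(-rate * horizon)) / rate

    true_cate = rmst(rate1) - rmst(rate0)  # ground-truth effect per unit

    return {"x": x, "w": w, "time": time, "event": event, "true_cate": true_cate}


if __name__ == "__main__":
    data = simulate_censored_hte()
    print("censored fraction:", 1.0 - data["event"].mean())
    print("mean true effect on RMST:", data["true_cate"].mean())
```

An estimator would be given only `x`, `w`, `time`, and `event`, and its predictions would be scored against `true_cate`, which is unavailable in real data. Relaxing the randomized treatment or the independent censoring in such a generator is one way to probe the kind of assumption violations the abstract describes.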