🤖 AI Summary
Estimating heterogeneous treatment effects (HTE) under time-varying interventions in longitudinal settings is complicated by carryover effects, time-varying heterogeneity, and post-treatment bias, none of which standard HTE methods address. TERRA (Transformer-Enabled Recursive R-learner) combines two components: a Transformer that encodes full treatment-feature histories to capture long-range temporal dependencies and carryover effects, and a recursive residual-learning formulation that addresses post-treatment bias. The recursive formulation generalizes classical structural nested mean models (SNMMs) beyond parametric specifications, reducing reliance on functional-form assumptions. In simulations and data applications, TERRA consistently outperforms strong baselines in both the accuracy and the stability of HTE estimates.
📝 Abstract
Accurately estimating heterogeneous treatment effects (HTE) in longitudinal settings is essential for personalized decision-making across healthcare, public policy, education, and digital marketing. However, time-varying interventions introduce challenges that standard HTE methods do not address, such as carryover effects, time-varying heterogeneity, and post-treatment bias. To address these challenges, we introduce TERRA (Transformer-Enabled Recursive R-learner), a framework for longitudinal HTE estimation with flexible temporal modeling. TERRA has two components. First, we use a Transformer architecture to encode full treatment-feature histories, which lets the model represent long-range temporal dependencies and carryover effects and thereby capture individual- and time-specific treatment effect variation more comprehensively. Second, we develop a recursive residual-learning formulation that generalizes classical structural nested mean models (SNMMs) beyond parametric specifications, addressing post-treatment bias while reducing reliance on functional-form assumptions. In simulations and data applications, TERRA consistently outperforms strong baselines in both the accuracy and the stability of HTE estimates, highlighting the value of combining principled causal structure with high-capacity sequence models for longitudinal HTE.
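To make the residual-learning idea concrete, the sketch below shows the classical single-period R-learner that the name TERRA alludes to. It is not TERRA's recursive, multi-period formulation, and it is not the authors' code. It assumes a randomized binary treatment with known propensity 0.5, a linear effect model, and a linear outcome model fitted without cross-fitting. The function name `r_learner_linear` and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def r_learner_linear(X, W, Y, m_hat, e_hat):
    """Fit tau(x) = [1, x] @ beta by minimizing the R-loss.

    R-loss: sum_i ((Y_i - m_hat_i) - (W_i - e_hat_i) * tau(X_i))^2
    For a linear tau this is ordinary least squares of the outcome
    residual on the treatment residual times the features.
    """
    y_res = Y - m_hat                       # outcome residual
    w_res = W - e_hat                       # treatment residual
    Z = np.column_stack([np.ones(len(X)), X])
    design = w_res[:, None] * Z             # residual-weighted features
    beta, *_ = np.linalg.lstsq(design, y_res, rcond=None)
    return beta


# Simulated single-period data (illustrative assumption, not from the paper).
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
e = np.full(n, 0.5)                         # known propensity (randomized)
W = rng.binomial(1, e)
tau_true = 1.0 + 2.0 * X[:, 0]              # true heterogeneous effect
Y = X[:, 1] + W * tau_true + rng.normal(scale=0.5, size=n)

# Nuisance estimate m(x) = E[Y | X] via a linear fit. A real analysis
# would typically use cross-fitting and a more flexible learner.
Z = np.column_stack([np.ones(n), X])
gamma, *_ = np.linalg.lstsq(Z, Y, rcond=None)
m_hat = Z @ gamma

beta_hat = r_learner_linear(X, W, Y, m_hat, e)
print(beta_hat)  # coefficients for [intercept, x0, x1]; truth is [1, 2, 0]
```

In the longitudinal setting the abstract describes, a residual regression of this kind would presumably be applied recursively over time, with history representations from the Transformer taking the place of the raw features `X`. The abstract does not give those details, so the sketch stops at the single-period case.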