TERRA: A Transformer-Enabled Recursive R-learner for Longitudinal Heterogeneous Treatment Effect Estimation

📅 2025-10-25
🤖 AI Summary
Estimating heterogeneous treatment effects (HTE) under time-varying interventions is complicated by carryover effects, time-varying heterogeneity, and post-treatment bias, none of which standard HTE methods address. TERRA (Transformer-Enabled Recursive R-learner) combines two components: a Transformer that encodes full treatment-feature histories to capture long-range temporal dependencies and carryover effects, and a recursive residual-learning formulation that generalizes structural nested mean models (SNMMs) beyond parametric specifications, addressing post-treatment bias while reducing reliance on functional-form assumptions. In simulations and data applications, TERRA consistently outperforms strong baselines in both the accuracy and the stability of HTE estimation.

📝 Abstract
Accurately estimating heterogeneous treatment effects (HTE) in longitudinal settings is essential for personalized decision-making across healthcare, public policy, education, and digital marketing. However, time-varying interventions introduce many unique challenges, such as carryover effects, time-varying heterogeneity, and post-treatment bias, which are not addressed by standard HTE methods. To address these challenges, we introduce TERRA (Transformer-Enabled Recursive R-learner), which facilitates longitudinal HTE estimation with flexible temporal modeling and learning. TERRA has two components. First, we use a Transformer architecture to encode full treatment-feature histories, enabling the representation of long-range temporal dependencies and carryover effects, hence capturing individual- and time-specific treatment effect variation more comprehensively. Second, we develop a recursive residual-learning formulation that generalizes the classical structural nested mean models (SNMMs) beyond parametric specifications, addressing post-treatment bias while reducing reliance on functional assumptions. In simulations and data applications, TERRA consistently outperforms strong baselines in HTE estimation in both accuracy and stability, highlighting the value of combining principled causal structure with high-capacity sequence models for longitudinal HTE.
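As background for the residual-learning idea, the following is a minimal sketch of the classical single-time-point R-learner, not TERRA's recursive formulation, which this page does not spell out. It regresses the outcome residual on the treatment residual times a linear effect model. The simulated data, the linear form of the effect, the known propensity of 0.5, and the helper names `ols_fit_predict` and `r_learner_linear` are all assumptions made for illustration, and cross-fitting of the nuisance estimates is omitted for brevity.

```python
import numpy as np

def ols_fit_predict(Z, y):
    """Least-squares fit of y on Z, returning in-sample predictions."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return Z @ coef

def r_learner_linear(X, W, Y, e_hat, m_hat):
    """Fit tau(x) = [1, x] @ beta by residual-on-residual least squares.

    Minimizes sum_i ((Y_i - m_hat_i) - (W_i - e_hat_i) * tau(X_i))^2.
    """
    y_res = Y - m_hat                      # outcome residual
    w_res = W - e_hat                      # treatment residual
    Z = np.column_stack([np.ones(len(X)), X])
    design = Z * w_res[:, None]            # effect basis scaled by treatment residual
    beta, *_ = np.linalg.lstsq(design, y_res, rcond=None)
    return beta

# Hypothetical simulated data: randomized treatment, linear effect.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
e_hat = np.full(n, 0.5)                    # propensity assumed known (randomized)
W = rng.binomial(1, e_hat).astype(float)
tau_true = 1.0 + 2.0 * X[:, 0]             # true effect varies with the first covariate
Y = X[:, 1] + W * tau_true + rng.normal(scale=0.5, size=n)

# Nuisance model m(x) = E[Y | X = x]; linear in this simulation, so OLS suffices.
Z = np.column_stack([np.ones(n), X])
m_hat = ols_fit_predict(Z, Y)

beta_hat = r_learner_linear(X, W, Y, e_hat, m_hat)
print(beta_hat)                            # coefficients for [intercept, x1, x2]
```

In this setup the recovered coefficients should land near the simulated effect `1 + 2 * x1`. In practice the nuisance functions would be fit with flexible learners and cross-fitting rather than in-sample OLS.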
Problem

Research questions and friction points this paper is trying to address.

Estimating time-varying treatment effects in the presence of carryover effects and time-varying heterogeneity
Addressing post-treatment bias in longitudinal causal inference settings
Modeling long-range temporal dependencies in treatment effect estimation
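Post-treatment bias can be illustrated with a toy two-period simulation. This is a hypothetical example, not taken from the paper: an early treatment `A1` shifts a later covariate `X2`, which in turn affects the outcome `Y`. Naively adjusting for `X2` blocks the path through it and recovers only part of the total effect of `A1`. All variable names and coefficients here are assumptions chosen for clarity.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

A1 = rng.binomial(1, 0.5, size=n).astype(float)   # randomized first-period treatment
X2 = 1.0 * A1 + rng.normal(size=n)                # later covariate, affected by A1
Y = 1.0 * A1 + 2.0 * X2 + rng.normal(size=n)      # outcome: direct path plus path via X2

def ols(columns, y):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    Z = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef

# Total effect of A1 in this simulation is 1.0 + 2.0 * 1.0 = 3.0.
total_effect = ols([A1], Y)[1]

# Conditioning on the post-treatment covariate X2 leaves only the direct path (1.0).
adjusted_effect = ols([A1, X2], Y)[1]

print(total_effect, adjusted_effect)
```

The gap between the two estimates is the portion of the effect that flows through `X2`. In longitudinal settings `X2` may also confound later treatments, so it can neither be simply ignored nor simply adjusted for, which is the tension that structural nested mean models are designed to handle.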
Innovation

Methods, ideas, or system contributions that make the work stand out.

A Transformer encodes full treatment-feature histories to capture long-range temporal dependencies and carryover effects
Recursive residual-learning generalizes structural nested mean models
Combines principled causal structure with high-capacity sequence models for longitudinal HTE estimation
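To make the history-encoding idea concrete, here is a minimal NumPy sketch of single-head self-attention with a causal mask over a sequence of treatment-feature tokens. It is an illustration only: the token layout (covariates concatenated with the treatment at each step), the random weights, the causal mask itself, and the function name `causal_self_attention` are assumptions, and the paper's actual architecture is not described on this page.

```python
import numpy as np

def causal_self_attention(H, Wq, Wk, Wv):
    """Single-head attention where step t attends only to steps 0..t."""
    T = H.shape[0]
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    future = np.triu(np.ones((T, T), dtype=bool), k=1)    # True above the diagonal
    scores = np.where(future, -np.inf, scores)            # hide future time steps
    scores = scores - scores.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V

# Hypothetical history for one unit: T time steps of covariates and binary treatments.
rng = np.random.default_rng(2)
T, d_x, d_model = 6, 3, 4
X = rng.normal(size=(T, d_x))
A = rng.binomial(1, 0.5, size=T).astype(float)
H = np.column_stack([X, A])                               # one token per time step

Wq, Wk, Wv = (rng.normal(size=(d_x + 1, d_model)) for _ in range(3))
out = causal_self_attention(H, Wq, Wk, Wv)

# Changing the final time step must leave all earlier representations untouched.
H_changed = H.copy()
H_changed[-1] += 1.0
out_changed = causal_self_attention(H_changed, Wq, Wk, Wv)
print(np.allclose(out[:-1], out_changed[:-1]))
```

Each row of `out` is a representation of the history up to that time step, so a downstream effect model at step t can draw on arbitrarily distant past treatments and covariates without seeing the future.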
Lei Shi
Adobe Research, San Jose, CA
Sizhu Lu
PhD student in Statistics, UC Berkeley
causal inference
Qiuran Lyu
Division of Biostatistics, UC Berkeley
Peng Ding
Department of Statistics, UC Berkeley
Nikos Vlassis
Adobe Research