🤖 AI Summary
This study addresses the instability in estimating heterogeneous long-term treatment effects (HLTEs) that arises when short-term experimental data are combined with long-term observational data and some subpopulations have limited overlap in treatment assignment or in the observation of long-term outcomes. To tackle this challenge, the authors propose the LT-O-Learners, a set of orthogonal learners that use custom overlap weights to downweight poorly overlapped samples and whose retargeted loss satisfies Neyman orthogonality, making estimation robust to errors in the nuisance estimates. According to the authors, these are, to the best of their knowledge, the first orthogonal learners for HLTE estimation that are robust to low overlap. The learners are model-agnostic, accommodate arbitrary machine learning models, and achieve a quasi-oracle convergence rate under stated conditions. Experiments on synthetic and semi-synthetic benchmarks confirm the theoretical properties, in particular the robustness in low-overlap settings.
📝 Abstract
Estimation of heterogeneous long-term treatment effects (HLTEs) is widely used for personalized decision-making in marketing, economics, and medicine, where short-term randomized experiments are often combined with long-term observational data. However, HLTE estimation is challenging due to limited overlap in treatment or in observing long-term outcomes for certain subpopulations, which can lead to unstable HLTE estimates with large finite-sample variance. To address this challenge, we introduce the LT-O-Learners (Long-Term Orthogonal Learners), a set of novel orthogonal learners for HLTE estimation. The learners are designed for the canonical HLTE setting that combines a short-term randomized dataset $\mathcal{D}_1$ with a long-term historical dataset $\mathcal{D}_2$. The key idea of our LT-O-Learners is to retarget the learning objective by introducing custom overlap weights that downweight samples with low overlap in treatment or in long-term observation. We show that the retargeted loss is equivalent to the weighted oracle loss and satisfies Neyman orthogonality, which means that our learners are robust to errors in the nuisance estimation. We further provide a general error bound for the LT-O-Learners and give the conditions under which a quasi-oracle rate can be achieved. Finally, our LT-O-Learners are model-agnostic and can thus be instantiated with arbitrary machine learning models. We conduct empirical evaluations on synthetic and semi-synthetic benchmarks to confirm the theoretical properties of our LT-O-Learners, especially their robustness in low-overlap settings. To the best of our knowledge, ours are the first orthogonal learners for HLTE estimation that are robust to the low overlap that is common with long-term outcomes.
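To make the downweighting idea concrete, the following is a minimal sketch and not the paper's construction. The abstract does not specify the custom weights, so the sketch assumes the generic overlap weight $e(x)(1 - e(x))$ for a treatment propensity $e(x)$. The function names, the toy propensities, and the toy targets are all hypothetical. It contrasts overlap weights with inverse-propensity weights and shows a weighted squared loss in which a poorly overlapped sample contributes little.

```python
import numpy as np


def overlap_weights(propensity):
    """Generic overlap weight e * (1 - e) (an assumption, not the paper's custom weights).

    The weight peaks at e = 0.5 and shrinks towards 0 as e approaches 0 or 1.
    """
    e = np.asarray(propensity, dtype=float)
    return e * (1.0 - e)


def inverse_propensity_weights(propensity):
    """Inverse-propensity-style weight 1 / (e * (1 - e)), which grows without bound under low overlap."""
    e = np.asarray(propensity, dtype=float)
    return 1.0 / (e * (1.0 - e))


def weighted_squared_loss(prediction, target, weights):
    """Weighted mean of squared residuals, normalized by the sum of the weights."""
    residual_sq = (np.asarray(prediction, dtype=float) - np.asarray(target, dtype=float)) ** 2
    return np.average(residual_sq, weights=weights)


# Hypothetical propensities: good overlap, moderate overlap, very low overlap.
propensities = np.array([0.5, 0.2, 0.01])
w_overlap = overlap_weights(propensities)
w_ipw = inverse_propensity_weights(propensities)

# Hypothetical targets and predictions; the low-overlap sample has an extreme target.
target = np.array([1.0, 2.0, 50.0])
prediction = np.array([1.0, 1.0, 0.0])

loss_unweighted = weighted_squared_loss(prediction, target, np.ones_like(target))
loss_overlap = weighted_squared_loss(prediction, target, w_overlap)

print("overlap weights:", w_overlap)
print("inverse-propensity weights:", w_ipw)
print("unweighted loss:", loss_unweighted)
print("overlap-weighted loss:", loss_overlap)
```

In this toy setting the sample with propensity 0.01 receives the smallest overlap weight but by far the largest inverse-propensity weight, which illustrates why downweighting low-overlap samples can stabilize the objective. The LT-O-Learners additionally account for overlap in observing long-term outcomes and use an orthogonal loss, neither of which this sketch attempts to reproduce.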