🤖 AI Summary
This paper addresses estimating heterogeneous treatment effects (HTE) in observational studies where the treatment-outcome relationship is subject to unobserved confounding but an observed mediator is unconfounded. We propose two novel debiased learners, FD-DR-Learner and FD-R-Learner, that systematically integrate the front-door adjustment into the debiased machine learning framework, marking the first such application. Both achieve quasi-oracle convergence rates and remain robust even when the auxiliary (nuisance) models converge at slower, nonparametric rates. Methodologically, they unify front-door identification with the R-learner and DR-learner frameworks, combining nonparametric estimation with rigorous error analysis. Evaluation on synthetic data and on the real-world Fatality Analysis Reporting System (FARS) dataset, which assesses the heterogeneous causal impact of seat belt laws on traffic fatality rates, demonstrates both high accuracy and sample efficiency. The implementation is publicly available.
📝 Abstract
In observational settings where treatment and outcome share unmeasured confounders but an observed mediator remains unconfounded, the front-door (FD) adjustment identifies causal effects through the mediator. We study the heterogeneous treatment effect (HTE) under FD identification and introduce two debiased learners: FD-DR-Learner and FD-R-Learner. Both attain fast, quasi-oracle rates (i.e., performance comparable to an oracle that knows the nuisances) even when nuisance functions converge as slowly as n^{-1/4}. We provide error analyses establishing debiasedness and demonstrate robust empirical performance in synthetic studies and a real-world case study of primary seat-belt laws using the Fatality Analysis Reporting System (FARS) dataset. Together, these results indicate that the proposed learners deliver reliable and sample-efficient HTE estimates in FD scenarios. The implementation is available at https://github.com/yonghanjung/FD-CATE.
Keywords: Front-door adjustment; Heterogeneous treatment effects; Debiased learning; Quasi-oracle rates; Causal inference.
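As a rough illustration of the front-door identification the abstract relies on, the sketch below simulates a toy dataset and compares a naive treated-vs-untreated contrast with a plug-in front-door estimate. It is not the paper's FD-DR-Learner or FD-R-Learner and does not use the FD-CATE repository: it drops covariates (so it targets an average effect rather than an HTE), assumes binary treatment and mediator, and every name and coefficient in it (`simulate`, `front_door_mean`, the probabilities 0.2, 0.6, 0.1, 0.7) is hypothetical. The estimator uses the standard front-door formula E[Y | do(a)] = sum_m P(m | a) sum_a' P(a') E[Y | a', m].

```python
import random


def simulate(n, seed=0):
    """Toy front-door data (all coefficients are illustrative assumptions).

    U is a hidden confounder of treatment A and outcome Y; the mediator M
    depends on A only, so the A -> M -> Y path is unconfounded.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        u = int(rng.random() < 0.5)               # unobserved confounder
        a = int(rng.random() < 0.2 + 0.6 * u)     # treatment depends on U
        m = int(rng.random() < 0.1 + 0.7 * a)     # mediator depends on A only
        y = 2.0 * m + 3.0 * u + rng.gauss(0.0, 1.0)
        rows.append((a, m, y))                    # U is deliberately not returned
    return rows


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def naive_difference(rows):
    """E[Y | A=1] - E[Y | A=0]; biased here because U drives both A and Y."""
    treated = mean(y for a, _, y in rows if a == 1)
    control = mean(y for a, _, y in rows if a == 0)
    return treated - control


def front_door_mean(rows, a_do):
    """Plug-in estimate of sum_m P(m | a_do) * sum_a' P(a') * E[Y | a', m]."""
    total = 0.0
    for m_val in (0, 1):
        p_m = mean(float(m == m_val) for a, m, _ in rows if a == a_do)
        inner = 0.0
        for a_val in (0, 1):
            p_a = mean(float(a == a_val) for a, _, _ in rows)
            cell = mean(y for a, m, y in rows if a == a_val and m == m_val)
            inner += p_a * cell
        total += p_m * inner
    return total


def front_door_effect(rows):
    return front_door_mean(rows, 1) - front_door_mean(rows, 0)


rows = simulate(200_000)
naive = naive_difference(rows)
fd = front_door_effect(rows)
# In this simulation the true effect is 2.0 * (0.8 - 0.1) = 1.4 by construction.
print(f"naive: {naive:.2f}  front-door: {fd:.2f}  truth: 1.40")
```

In this toy population the naive contrast converges to 3.2 rather than 1.4, because it absorbs the confounder's contribution, while the front-door plug-in targets 1.4 without ever observing U. The learners described in the abstract go further by conditioning on covariates and debiasing the nuisance estimates, which this sketch does not attempt.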