🤖 AI Summary
This paper addresses the failure of standard difference-in-differences (DiD) estimators of the average treatment effect on the treated (ATT) when outcomes are missing at random (MAR). We propose a multiply robust, asymptotically efficient semiparametric estimator. After establishing nonparametric identification, we construct a doubly robust structure that combines inverse probability weighting with outcome regression adjustment, and we derive the semiparametric efficiency bound. The estimator is model-robust: it remains consistent if either the propensity score model or the outcome regression model is correctly specified. In simulation studies, the estimator achieves low bias, high precision, and stable inference across diverse MAR mechanisms (pre-treatment, post-treatment, and bidirectional missingness), outperforming conventional DiD and singly robust alternatives. The method provides a theoretically rigorous and practically viable tool for causal effect estimation under MAR outcomes.
📝 Abstract
The Difference-in-Differences (DiD) method is a fundamental tool for causal inference, yet its application is often complicated by missing data. Although recent work has developed robust DiD estimators for complex settings like staggered treatment adoption, these methods typically assume complete data and fail to address the critical challenge of outcomes that are missing at random (MAR), a common problem that invalidates standard estimators. We develop a rigorous framework, rooted in semiparametric theory, for identifying and efficiently estimating the Average Treatment Effect on the Treated (ATT) when pre-treatment outcomes, post-treatment outcomes, or both are missing at random. We first establish nonparametric identification of the ATT under two minimal sets of sufficient conditions. For each set, we derive the semiparametric efficiency bound, which provides a formal benchmark for asymptotic optimality. We then propose novel estimators that are asymptotically efficient, achieving this theoretical bound. A key feature of our estimators is their multiple robustness, which ensures consistency even if some nuisance function models are misspecified. We validate the properties of our estimators and showcase their broad applicability through an extensive simulation study.
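To make the idea of combining inverse probability weighting with outcome regression concrete, the following is a minimal Python sketch of the standard complete-data doubly robust DiD estimator of the ATT for two-period panel data. It is not the estimator proposed in the paper: it assumes no outcomes are missing, and the function names (`fit_logit`, `dr_did_att`), the logistic and linear working models, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def fit_logit(X, d, n_iter=25):
    """Fit a logistic regression by Newton-Raphson (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (d - p)                          # score of the log-likelihood
        hess = (X * (p * (1.0 - p))[:, None]).T @ X   # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def dr_did_att(y_pre, y_post, d, x):
    """Doubly robust DiD estimate of the ATT with complete two-period panel data.

    Illustrative working models (assumptions): logistic propensity score,
    linear regression of the outcome change among controls.
    """
    n = len(d)
    X = np.column_stack([np.ones(n), x])
    delta_y = y_post - y_pre

    # Propensity score P(D = 1 | X).
    p = 1.0 / (1.0 + np.exp(-X @ fit_logit(X, d)))

    # Outcome regression: E[delta_y | D = 0, X], fitted on controls only.
    ctrl = d == 0
    gamma, *_ = np.linalg.lstsq(X[ctrl], delta_y[ctrl], rcond=None)
    m0 = X @ gamma

    # Normalized weights for treated units and reweighted control units.
    w_treated = d / d.mean()
    w_ctrl_raw = p * (1.0 - d) / (1.0 - p)
    w_ctrl = w_ctrl_raw / w_ctrl_raw.mean()

    return np.mean((w_treated - w_ctrl) * (delta_y - m0))


# Simulated example (hypothetical data): the true ATT is 2.0 by construction.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-0.5 * x))
d = rng.binomial(1, p_true).astype(float)
y_pre = x + rng.normal(size=n)
# The untreated trend depends on x, so parallel trends holds only conditional on x.
y_post = y_pre + 1.0 + 1.0 * x + 2.0 * d + rng.normal(size=n)

att_hat = dr_did_att(y_pre, y_post, d, x)
delta = y_post - y_pre
naive_did = delta[d == 1].mean() - delta[d == 0].mean()
print(f"doubly robust DiD: {att_hat:.3f}, unadjusted DiD: {naive_did:.3f}")
```

In this simulated design the outcome trend depends on the covariate, so the unadjusted comparison of mean outcome changes is confounded, while the doubly robust estimate should land near the true value of 2. Handling MAR outcomes would presumably require further nuisance models for the missingness mechanism, which this sketch does not attempt.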