Efficient Difference-in-Differences Estimation when Outcomes are Missing at Random

📅 2025-09-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the failure of standard difference-in-differences (DiD) estimators of the average treatment effect on the treated (ATT) when outcomes are missing at random (MAR). It proposes multiply robust, asymptotically efficient semiparametric estimators. After establishing nonparametric identification, the authors derive the semiparametric efficiency bound and construct estimators that combine inverse probability weighting with outcome regression adjustment. The estimators are robust to model misspecification: they remain consistent even if some of the nuisance function models, such as the propensity score or outcome regression models, are misspecified. Simulation studies show low bias, high precision, and stable inference across several MAR mechanisms (missingness in pre-treatment outcomes, in post-treatment outcomes, or in both), with better performance than conventional DiD and singly robust alternatives. The method gives a theoretically grounded, practical tool for causal effect estimation under MAR outcomes.

📝 Abstract

The Difference-in-Differences (DiD) method is a fundamental tool for causal inference, yet its application is often complicated by missing data. Although recent work has developed robust DiD estimators for complex settings like staggered treatment adoption, these methods typically assume complete data and fail to address the critical challenge of outcomes that are missing at random (MAR) -- a common problem that invalidates standard estimators. We develop a rigorous framework, rooted in semiparametric theory, for identifying and efficiently estimating the Average Treatment Effect on the Treated (ATT) when either pre- or post-treatment (or both) outcomes are missing at random. We first establish nonparametric identification of the ATT under two minimal sets of sufficient conditions. For each, we derive the semiparametric efficiency bound, which provides a formal benchmark for asymptotic optimality. We then propose novel estimators that are asymptotically efficient, achieving this theoretical bound. A key feature of our estimators is their multiple robustness, which ensures consistency even if some nuisance function models are misspecified. We validate the properties of our estimators and showcase their broad applicability through an extensive simulation study.
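
To make the problem concrete, here is a minimal simulation sketch. It does not reproduce the paper's estimators. It only illustrates, under assumptions chosen for this example, how a complete-case two-period DiD contrast can be biased when post-treatment outcomes are MAR, and how weighting by the inverse of an estimated response probability can remove that bias in a simple case. The data-generating process, the function names (`simulate`, `complete_case_did`, `ipw_did`), and the binary covariate `x` are all hypothetical.

```python
import numpy as np


def simulate(n=200_000, seed=0):
    """Simulated two-period panel; every design choice here is an assumption."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)          # binary covariate
    d = rng.integers(0, 2, size=n)          # treatment indicator, independent of x
    y_pre = x + d + rng.normal(size=n)      # pre-treatment outcome
    # The untreated trend depends on x; the true ATT is 2.0 by construction.
    y_post = y_pre + 1.0 + 2.0 * x + 2.0 * d + rng.normal(size=n)
    # MAR: the response probability depends only on observed (x, d).
    p_obs = 0.9 - 0.6 * x * d
    r = rng.random(n) < p_obs               # True where y_post is observed
    y_post = np.where(r, y_post, np.nan)    # mask the missing outcomes
    return x, d, y_pre, y_post, r


def complete_case_did(d, y_pre, y_post, r):
    """DiD contrast using only units whose post-treatment outcome is observed."""
    delta = y_post - y_pre
    return delta[r & (d == 1)].mean() - delta[r & (d == 0)].mean()


def ipw_did(x, d, y_pre, y_post, r):
    """DiD contrast with observed units reweighted by 1 / P(observed | x, d)."""
    # Replace undefined differences by 0; they receive zero weight anyway.
    delta = np.where(r, y_post - y_pre, 0.0)
    p_hat = np.empty(len(x))
    for xv in (0, 1):
        for dv in (0, 1):
            cell = (x == xv) & (d == dv)
            p_hat[cell] = r[cell].mean()    # saturated response model
    w = r / p_hat                           # zero weight for missing outcomes

    def weighted_mean(mask):
        return np.sum(w[mask] * delta[mask]) / np.sum(w[mask])

    return weighted_mean(d == 1) - weighted_mean(d == 0)


if __name__ == "__main__":
    x, d, y_pre, y_post, r = simulate()
    print("complete-case DiD:", complete_case_did(d, y_pre, y_post, r))
    print("IPW DiD:          ", ipw_did(x, d, y_pre, y_post, r))
```

In this design, treated units with `x = 1` are under-represented among the observed outcomes, so the complete-case contrast converges to 1.5 rather than the true ATT of 2.0, while the weighted contrast recovers 2.0. The weighting step relies entirely on a correct response model. That single point of failure is what the multiple robustness described in the abstract is meant to relax.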

Problem

Research questions and friction points this paper is trying to address.

Estimating causal effects in DiD designs when outcomes are missing at random (MAR)

Developing efficient estimators of the ATT when pre- or post-treatment outcomes (or both) are missing

Keeping estimation consistent when some nuisance function models are misspecified

Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient ATT estimation with missing outcomes

Multiple robustness against model misspecification

Semiparametric efficiency bound achievement
