Identification and Estimation of Long-Term Treatment Effects with Monotone Missing

📅 2025-04-28

📈 Citations: 0

✨ Influential: 0

career value

198K/year

🤖 AI Summary

This paper addresses the pervasive monotone missingness problem—where subjects with early missing outcomes exhibit complete absence of subsequent observations—in long-term treatment effect estimation. We first systematically establish a causal identification framework under monotone missingness, introducing the sequential missingness assumption to ensure identifiability. We then propose three novel methodological families: (i) inverse probability weighting (IPW)-based estimators, (ii) sequential regression imputation approaches, and (iii) a doubly innovative strategy combining sequential marginal structural models (SeqMSMs) with a deep balancing network (BalanceNet). BalanceNet enhances covariate balance to substantially improve stability and accuracy under sparse missingness patterns; on two benchmark datasets, it reduces estimation variance by up to 37% relative to state-of-the-art baselines. Our framework delivers a scalable, robust paradigm for long-term causal inference under realistic missing-data mechanisms.

Technology Category

Application Category

📝 Abstract

Estimating long-term treatment effects has a wide range of applications in various domains. A key feature in this context is that collecting long-term outcomes typically involves a multi-stage process and is subject to monotone missing, where individuals missing at an earlier stage remain missing at subsequent stages. Despite its prevalence, monotone missing has been rarely explored in previous studies on estimating long-term treatment effects. In this paper, we address this gap by introducing the sequential missingness assumption for identification. We propose three novel estimation methods, including inverse probability weighting, sequential regression imputation, and sequential marginal structural model (SeqMSM). Considering that the SeqMSM method may suffer from high variance due to severe data sparsity caused by monotone missing, we further propose a novel balancing-enhanced approach, BalanceNet, to improve the stability and accuracy of the estimation methods. Extensive experiments on two widely used benchmark datasets demonstrate the effectiveness of our proposed methods.

Problem

Research questions and friction points this paper is trying to address.

Estimating long-term treatment effects with monotone missing data

Addressing data sparsity in multi-stage outcome collection

Improving estimation stability via balancing-enhanced methods

Innovation

Methods, ideas, or system contributions that make the work stand out.

Sequential missingness assumption for identification

BalanceNet enhances SeqMSM stability and accuracy

Three novel estimation methods for monotone missing

🔎 Similar Papers

Estimating Long-term Heterogeneous Dose-response Curve: Generalization Bound Leveraging Optimal Transport Weights

2024-06-27arXiv.orgCitations: 3

💼 Related Jobs

Research Engineer, Monetization AI