Weighting-Based Identification and Estimation in Graphical Models of Missing Data

📅 2026-02-11
🤖 AI Summary
This study addresses selection bias arising from missing data in graphical models by proposing an intervention-based framework for identification and estimation. Treating missingness indicators as variables that can be intervened on, the authors introduce a tree-based identification algorithm that explicitly tracks how selection bias is created and propagated, and that determines whether it can be avoided through admissible intervention strategies. Building on this foundation, they develop recursive inverse probability weighting procedures that yield valid estimating equations for both the missingness mechanism and functionals of the complete data distribution. The accompanying R package, flexMissing, implements the proposed procedures, and simulation studies and a real-data application illustrate their practical performance.

📝 Abstract
We propose a constructive algorithm for identifying complete data distributions in graphical models of missing data. The complete data distribution is unrestricted, while the missingness mechanism is assumed to factorize according to a conditional directed acyclic graph. Our approach follows an interventionist perspective in which missingness indicators are treated as variables that can be intervened on. A central challenge in this setting is that sequences of interventions on missingness indicators may induce and propagate selection bias, so that identification can fail even when a propensity score is invariant to available interventions. To address this challenge, we introduce a tree-based identification algorithm that explicitly tracks the creation and propagation of selection bias and determines whether it can be avoided through admissible intervention strategies. The resulting tree provides both a diagnostic and a constructive characterization of identifiability under a given missingness mechanism. Building on these results, we develop recursive inverse probability weighting procedures that mirror the intervention logic of the identification algorithm, yielding valid estimating equations for both the missingness mechanism and functionals of the complete data distribution. Simulation studies and a real-data application illustrate the practical performance of the proposed methods. An accompanying R package, flexMissing, implements all proposed procedures.
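
To make the weighting idea concrete, here is a minimal sketch of ordinary (single-step) inverse probability weighting in Python. It is not the paper's recursive procedure and does not use the flexMissing package. The data-generating model, the function names (`simulate`, `complete_case_mean`, `ipw_mean`), and all numeric settings are assumptions chosen for illustration. A binary covariate X is always observed, the outcome Y is missing with a probability that depends on X, and the propensity of being observed is estimated by the observed fraction within each level of X.

```python
import random


def simulate(n, seed=0):
    """Draw (x, y, r) rows; y is None when the missingness indicator r is 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)  # true mean of Y is 2.0
        p_obs = 0.3 if x == 1 else 0.9           # missingness depends on X only
        r = 1 if rng.random() < p_obs else 0
        rows.append((x, y if r == 1 else None, r))
    return rows


def complete_case_mean(rows):
    """Naive mean of Y over observed rows; biased when missingness depends on X."""
    observed = [y for _, y, r in rows if r == 1]
    return sum(observed) / len(observed)


def ipw_mean(rows):
    """Horvitz-Thompson style IPW estimate of E[Y] with a stratified propensity."""
    counts = {0: [0, 0], 1: [0, 0]}  # x -> [rows in stratum, rows with Y observed]
    for x, _, r in rows:
        counts[x][0] += 1
        counts[x][1] += r
    propensity = {x: obs / total for x, (total, obs) in counts.items()}
    # Each observed Y is weighted by the inverse of its estimated propensity.
    weighted_sum = sum(y / propensity[x] for x, y, r in rows if r == 1)
    return weighted_sum / len(rows)


rows = simulate(20000, seed=1)
cc = complete_case_mean(rows)
ipw = ipw_mean(rows)
print(f"complete-case mean: {cc:.3f}")
print(f"IPW mean:           {ipw:.3f}")
```

Under these assumed settings the complete-case mean is pulled toward the X = 0 stratum, which is observed more often, while the weighted estimate should land close to the true mean of 2.0. The setting described in the abstract is harder: there are several missingness indicators, and intervening on one of them can introduce selection bias with respect to another, which is why the weighting there is applied recursively and guided by the identification tree.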

Problem

Research questions and friction points this paper is trying to address.

missing data
graphical models
selection bias
identifiability
intervention

Innovation

Methods, ideas, or system contributions that make the work stand out.

interventionist perspective
selection bias propagation
tree-based identification algorithm
recursive inverse probability weighting
graphical models of missing data