Markov Missing Graph: A Graphical Approach for Missing Data Imputation

📅 2025-09-03
🤖 AI Summary
This paper addresses the limited flexibility and identifiability of imputation models under missing data. We propose the Markov missing graph (MMG) framework, which uses an undirected graph to encode conditional independence relationships and thereby decompose the imputation model locally. Identification rests on the Principle of Available Information (PAI), which guides the use of all relevant observed data; under PAI, imputation is formulated as an empirical risk minimization problem that accommodates a variety of modeling choices. Theoretically, we relate MMG to Little's complete-case missing value assumption and establish recovery under missing completely at random (MCAR), along with efficiency theory and graph-related properties. Simulation studies and an application to a real-world Alzheimer's disease data set demonstrate the validity of the method.

📝 Abstract
We introduce the Markov missing graph (MMG), a novel framework that imputes missing data based on undirected graphs. MMG leverages conditional independence relationships to locally decompose the imputation model. To establish the identification, we introduce the Principle of Available Information (PAI), which guides the use of all relevant observed data. We then propose a flexible statistical learning paradigm, MMG Imputation Risk Minimization under PAI, that frames the imputation task as an empirical risk minimization problem. This framework is adaptable to various modeling choices. We develop theories of MMG, including the connection between MMG and Little's complete-case missing value assumption, recovery under missing completely at random, efficiency theory, and graph-related properties. We show the validity of our method with simulation studies and illustrate its application with a real-world Alzheimer's data set.
Problem

Research questions and friction points this paper is trying to address.

Imputing missing data using undirected graphical models
Leveraging conditional independence relationships for local decomposition
Establishing identification through the Principle of Available Information
Innovation

Methods, ideas, or system contributions that make the work stand out.

Undirected graph-based missing data imputation
Conditional independence for local model decomposition
Empirical risk minimization under the Principle of Available Information
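
To make the local-decomposition and risk-minimization ideas listed above more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the function name `impute_with_graph`, the adjacency-dictionary format, the linear least-squares predictor, and the mean fallback are all assumptions chosen for illustration. For each missing entry, it fits a predictor of that variable on whichever graph neighbors are observed in the same row, using every row where the target and those neighbors are jointly available.

```python
import numpy as np

def impute_with_graph(X, neighbors):
    """Hypothetical sketch: impute missing entries from observed graph neighbors.

    X         : (n, d) array with np.nan marking missing entries.
    neighbors : dict mapping a column index to a list of adjacent column indices
                in an undirected graph (assumed given, not learned here).
    """
    X = np.asarray(X, dtype=float)
    observed = ~np.isnan(X)
    X_imp = X.copy()
    n, d = X.shape

    for j in range(d):
        rows_missing = np.where(~observed[:, j])[0]
        if rows_missing.size == 0:
            continue
        nbrs = np.array(sorted(neighbors.get(j, [])), dtype=int)
        # Fallback when no usable neighbor information exists: observed mean.
        fallback = X[observed[:, j], j].mean() if observed[:, j].any() else np.nan

        # Group rows by which neighbors of j are observed in that row.
        patterns = {}
        for i in rows_missing:
            key = tuple(nbrs[observed[i, nbrs]])
            patterns.setdefault(key, []).append(i)

        for key, rows in patterns.items():
            cols = list(key)
            if not cols:
                X_imp[rows, j] = fallback
                continue
            # Training rows: target and all predictors jointly observed.
            train = observed[:, j] & observed[:, cols].all(axis=1)
            if train.sum() <= len(cols):
                X_imp[rows, j] = fallback
                continue
            # Squared-error risk minimization with a linear model (an assumption;
            # any predictive model could be substituted here).
            A = np.column_stack([np.ones(train.sum()), X[train][:, cols]])
            coef, *_ = np.linalg.lstsq(A, X[train, j], rcond=None)
            B = np.column_stack([np.ones(len(rows)), X[rows][:, cols]])
            X_imp[rows, j] = B @ coef
    return X_imp

# Toy usage: column 1 depends linearly on column 0; column 2 is isolated.
x0 = np.arange(10.0)
X_demo = np.column_stack([x0, 2.0 * x0 + 1.0, np.full(10, 5.0)])
X_demo[0, 1] = np.nan   # imputed from neighbor column 0
X_demo[3, 2] = np.nan   # no neighbors, so the observed mean is used
graph_demo = {0: [1], 1: [0], 2: []}
X_filled = impute_with_graph(X_demo, graph_demo)
```

The linear model is only a placeholder. Because each fit involves a variable and its neighbors alone, the regression step could be swapped for another learner without touching the rest of the loop.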
Authors
Yanjiao Yang
Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA
Yen-Chi Chen
Department of Statistics, University of Washington
Nonparametric Statistics, Missing Data, Clustering, Astrostatistics