🤖 AI Summary
This work addresses imputation when data are missing not at random (MNAR), that is, when the missingness mechanism depends on the unobserved values themselves. It proposes PRDIM, a framework that explicitly models missing-pattern information within a diffusion-based imputation architecture. PRDIM combines a diffusion generative model, the expectation-maximization (EM) algorithm, and a dedicated missing-pattern recognition network to iteratively maximize the likelihood of the joint distribution of observed data and missing masks, with the aim of accurate imputation under MNAR conditions. The reported experiments show PRDIM consistently achieving strong imputation performance under MNAR settings across multiple data modalities.
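As a rough illustration of what "the joint distribution of observed data and missing masks" means, the objective can be written in the standard selection-model form. The symbols here are assumptions introduced for illustration and are not taken from the paper: $x_{\mathrm{obs}}$ and $x_{\mathrm{mis}}$ are the observed and missing parts of a sample, $m$ is the missing mask, $p_\theta$ is the data model, $p_\phi$ is the mask model, and $q$ is any distribution over the missing values. The paper's actual parameterization may differ.

```latex
\begin{aligned}
\log p_{\theta,\phi}(x_{\mathrm{obs}}, m)
  &= \log \int p_\theta(x_{\mathrm{obs}}, x_{\mathrm{mis}})\,
       p_\phi(m \mid x_{\mathrm{obs}}, x_{\mathrm{mis}})\, dx_{\mathrm{mis}} \\
  &\ge \mathbb{E}_{q(x_{\mathrm{mis}})}\!\left[
       \log p_\theta(x_{\mathrm{obs}}, x_{\mathrm{mis}})
       + \log p_\phi(m \mid x_{\mathrm{obs}}, x_{\mathrm{mis}})
       - \log q(x_{\mathrm{mis}}) \right]
\end{aligned}
```

Under MNAR the mask term $p_\phi(m \mid x_{\mathrm{obs}}, x_{\mathrm{mis}})$ depends on $x_{\mathrm{mis}}$ and cannot be pulled out of the integral, which is why the mask has to be modeled jointly with the data. A generic EM scheme alternates between updating $q$ (E-step) and maximizing the bound over $\theta$ and $\phi$ (M-step).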
📝 Abstract
Missing data frequently arise across diverse domains, including time series and images. In real-world settings, whether a value is missing often depends on the unobserved value itself, a mechanism referred to as Missing Not at Random (MNAR). In this work, we introduce the Missing Pattern Recognized Diffusion Imputation Model (PRDIM), a novel framework that explicitly captures the missing pattern and accurately imputes unobserved values. PRDIM iteratively maximizes the likelihood of the joint distribution of observed values and the missing mask using an Expectation-Maximization (EM) algorithm. To this end, we employ a pattern recognizer, which approximates the underlying missing pattern and guides each inference step toward imputations that are more plausible given the missingness information. Through extensive experiments, we demonstrate that PRDIM consistently achieves strong imputation performance under MNAR settings across multiple data modalities.
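To make the MNAR definition concrete, the following minimal sketch simulates a "self-masking" mechanism in which larger values are more likely to be missing. It is not PRDIM's method or experimental setup: the function name `mnar_self_mask`, the logistic form, and the `steepness` and `threshold` parameters are all hypothetical choices made for illustration.

```python
import numpy as np

def mnar_self_mask(x, steepness=3.0, threshold=0.0, seed=0):
    """Return a boolean array that is True where a value is observed.

    Hypothetical MNAR mechanism: the probability that a value is missing
    is a logistic function of the value itself, so large values are
    dropped more often than small ones.
    """
    rng = np.random.default_rng(seed)
    p_missing = 1.0 / (1.0 + np.exp(-steepness * (x - threshold)))
    # A value is observed when the uniform draw is at or above p_missing.
    return rng.random(x.shape) >= p_missing

# Synthetic standard-normal data (an assumption for this demo).
x = np.random.default_rng(42).normal(size=10_000)
observed = mnar_self_mask(x)

full_mean = x.mean()
observed_mean = x[observed].mean()
print(f"fraction observed: {observed.mean():.2f}")
print(f"mean of all values: {full_mean:.3f}")
print(f"mean of observed values only: {observed_mean:.3f}")
```

Because large values are preferentially removed, the mean of the observed values falls below the mean of the full data. Any imputer that ignores the mask and learns only from observed values inherits this bias, which is the motivation for modeling the missing pattern explicitly.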