🤖 AI Summary
This work addresses the bias that arises when the missingness pattern is coupled with the underlying data during imputation. We propose a framework that casts imputation as minimizing the mutual information between the imputed data and the missingness mask—to our knowledge, the first formulation of imputation as mutual information minimization. We derive that the optimal imputation under this objective is obtained by solving a specific ordinary differential equation (ODE), a result that unifies and interprets diverse state-of-the-art methods under a common theoretical lens. Our method constructs a learnable velocity field via Rectified Flows and iteratively minimizes the KL divergence between the joint distribution of imputed data and mask and the product of their marginals, integrating principles from generative modeling and information theory. Extensive experiments on synthetic benchmarks and multiple real-world datasets demonstrate consistent and significant improvements over current SOTA approaches, validating both the theoretical soundness and empirical efficacy of the approach.
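To make the motivating coupling concrete, here is a toy illustration (not code from the paper): under an MNAR mechanism where large values go missing more often, naive mean imputation leaves the mask predictable from the imputed data, so a plug-in estimate of their mutual information stays well above zero. The binarized summary `b` and the histogram MI estimator are illustrative choices, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# MNAR toy: large values are more likely to be missing, coupling data and mask.
x = rng.normal(0.0, 1.0, size=n)
m = (rng.uniform(size=n) < np.where(x > 0, 0.8, 0.2)).astype(int)  # 1 = missing

# Naive mean imputation: fill missing entries with the observed mean.
x_imp = np.where(m == 1, x[m == 0].mean(), x)

def mutual_info(a, b):
    """Plug-in mutual information (nats) between two binary arrays."""
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

# Binarize the imputed data at its median and measure dependence on the mask.
b = (x_imp > np.median(x_imp)).astype(int)
print(mutual_info(b, m))  # noticeably above 0: the mask remains predictable
```

An imputer that actually drove the mutual information toward zero would make this estimate vanish, which is the objective the framework above formalizes.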
📝 Abstract
This paper introduces a novel iterative method for missing data imputation that progressively reduces the mutual information between the data and the corresponding missing mask. Inspired by GAN-based approaches, which train generators to make missingness patterns harder to predict, our method targets the reduction of mutual information explicitly. Specifically, each iteration minimizes the KL divergence between the joint distribution of the imputed data and missing mask, and the product of their marginals from the previous iteration. We show that the optimal imputation under this framework corresponds to solving an ODE whose velocity field minimizes a rectified flow training objective. We further show that several existing imputation techniques can be interpreted as approximate special cases of our mutual-information-reducing framework. Comprehensive experiments on synthetic and real-world datasets validate the efficacy of the proposed approach, demonstrating superior imputation performance.
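The rectified flow ingredient can be sketched on a toy one-dimensional problem. Everything below is an assumption-laden illustration, not the paper's architecture: a linear velocity model stands in for the learned velocity field, and the "data" and "noise" distributions are arbitrary Gaussians. The model is trained on the standard rectified flow objective (regress the straight-line velocity x1 − x0 on interpolants xt), and samples are then drawn by Euler-integrating the resulting ODE from noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D setup: x1 plays the role of target (observed) data,
# x0 the noise used to initialize missing entries.
x1 = rng.normal(2.0, 0.5, size=(256, 1))
x0 = rng.normal(0.0, 1.0, size=(256, 1))

# Hypothetical linear velocity model v(x, t) = W @ [x, t, 1].
W = np.zeros((1, 3))
lr = 0.1
for step in range(2000):
    t = rng.uniform(0, 1, size=(256, 1))
    xt = (1 - t) * x0 + t * x1                  # straight-line interpolation
    feats = np.concatenate([xt, t, np.ones_like(t)], axis=1)
    pred = feats @ W.T
    target = x1 - x0                            # rectified flow velocity target
    grad = 2 * (pred - target).T @ feats / len(feats)
    W -= lr * grad                              # least-squares gradient step

# Impute by solving dx/dt = v(x, t) from noise with Euler steps.
n_steps = 100
x = rng.normal(0.0, 1.0, size=(1000, 1))
for k in range(n_steps):
    t = np.full((1000, 1), k / n_steps)
    feats = np.concatenate([x, t, np.ones_like(t)], axis=1)
    x += (feats @ W.T) / n_steps

print(float(x.mean()))  # transported samples concentrate near the target mean
```

The mean of the transported samples ends up near 2, the mean of the target distribution, which is the sense in which solving the ODE with the learned velocity field performs the imputation step.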