Missing At Random as Covariate Shift: Correcting Bias in Iterative Imputation

📅 2026-02-06
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This study addresses the covariate shift bias arising in missing data imputation due to distributional discrepancies between observed and unobserved data. It is the first to formally frame imputation bias under the Missing-at-Random (MAR) mechanism as a covariate shift problem. The authors propose a risk minimization framework based on importance weighting, which jointly and iteratively estimates importance weights and the imputation model to dynamically correct this bias. Theoretical analysis demonstrates that the proposed weighting scheme effectively mitigates covariate shift. Empirical evaluations on multiple benchmark datasets confirm the method's superiority: compared to unweighted approaches, it reduces root mean squared error by up to 7% and Wasserstein distance by up to 20%.
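To make the covariate shift framing concrete, the block below writes out the standard importance-weighting identity for a single incomplete variable. It is a sketch under stated assumptions rather than the paper's own derivation: the notation (x_j for the incomplete variable, x_{-j} for the remaining variables, R_j for the observation indicator, f for the imputation model, \ell for the loss) is introduced here for illustration, x_{-j} is assumed fully observed, and p(x_{-j} | R_j = 1) is assumed positive wherever p(x_{-j} | R_j = 0) is.

```latex
% Risk on the rows where x_j is missing, rewritten over the rows where it is observed.
% MAR gives p(x_j | x_{-j}, R_j = 0) = p(x_j | x_{-j}, R_j = 1), so only the
% marginal of x_{-j} differs between the two populations (covariate shift).
\begin{align}
\mathcal{R}_{\mathrm{mis}}(f)
  &= \mathbb{E}_{p(x_{-j},\, x_j \mid R_j = 0)}
     \bigl[\ell\bigl(f(x_{-j}), x_j\bigr)\bigr] \\
  &= \mathbb{E}_{p(x_{-j},\, x_j \mid R_j = 1)}
     \bigl[w(x_{-j})\,\ell\bigl(f(x_{-j}), x_j\bigr)\bigr], \\
w(x_{-j})
  &= \frac{p(x_{-j} \mid R_j = 0)}{p(x_{-j} \mid R_j = 1)}
   = \frac{P(R_j = 0 \mid x_{-j})}{P(R_j = 1 \mid x_{-j})}
     \cdot \frac{P(R_j = 1)}{P(R_j = 0)}.
\end{align}
```

The last equality follows from Bayes' rule, so under these assumptions the weight is proportional to the odds of x_j being missing given the other variables.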

📝 Abstract
Accurate imputation of missing data is critical to downstream machine learning performance. We formulate missing data imputation as a risk minimisation problem, which highlights a covariate shift between the observed and unobserved data distributions. This covariate-shift-induced bias is not accounted for by popular imputation methods and leads to suboptimal performance. In this paper, we derive theoretically valid importance weights that correct for the induced distributional bias. Furthermore, we propose a novel imputation algorithm that jointly estimates both the importance weights and imputation models, enabling bias correction throughout the imputation process. Empirical results across benchmark datasets show reductions in root mean squared error and Wasserstein distance of up to 7% and 20%, respectively, compared to otherwise identical unweighted methods.
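The idea of weighting the imputation risk can be illustrated on a toy case. The following is a minimal sketch and not the authors' algorithm: it assumes a single incomplete column y and one fully observed covariate x, estimates P(observed | x) with a small logistic regression, and then fits a weighted linear imputation model on the observed rows. All function names (`sigmoid`, `fit_logistic`, `weighted_linear_fit`, `impute_weighted`), the learning rate, and the clipping threshold are hypothetical choices made for this illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, rs, lr=0.1, steps=2000):
    """Fit P(R=1 | x) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, r in zip(xs, rs):
            p = sigmoid(a + b * x)
            grad_a += r - p
            grad_b += (r - p) * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b


def weighted_linear_fit(xs, ys, ws):
    """Closed-form minimiser of sum_i w_i * (y_i - a - b*x_i)**2; returns (a, b)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def impute_weighted(xs, ys, clip=1e-3):
    """Impute None entries of ys from xs with an importance-weighted linear model.

    Returns (completed_ys, weights_on_observed_rows).
    """
    rs = [0 if y is None else 1 for y in ys]
    a, b = fit_logistic(xs, rs)
    obs = [i for i, r in enumerate(rs) if r == 1]
    weights = []
    for i in obs:
        # Clip the estimated observation probability so the odds stay finite.
        p = min(max(sigmoid(a + b * xs[i]), clip), 1.0 - clip)
        # Odds of being missing; proportional to p(x | R=0) / p(x | R=1).
        # The constant factor P(R=1)/P(R=0) does not change the minimiser.
        weights.append((1.0 - p) / p)
    c0, c1 = weighted_linear_fit(
        [xs[i] for i in obs], [ys[i] for i in obs], weights
    )
    completed = [c0 + c1 * x if y is None else y for x, y in zip(xs, ys)]
    return completed, weights
```

Calling `impute_weighted(xs, ys)` with `None` marking missing entries of `ys` returns the completed column together with the weights used. With several incomplete columns, one would presumably cycle over the columns and re-estimate the weights in each round, in line with the joint estimation the abstract describes; that loop is left out of this sketch.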
Problem

Research questions and friction points this paper is trying to address.

missing data imputation
covariate shift
distributional bias
iterative imputation
missing at random
Innovation

Methods, ideas, or system contributions that make the work stand out.

covariate shift
importance weighting
iterative imputation
missing at random
bias correction
Luke Shannon
School of Mathematics, University of Bristol, Bristol, United Kingdom
Song Liu
Associate Professor, University of Bristol, UK
Statistical Machine Learning
Katarzyna Reluga
School of Business and Economics, Humboldt University of Berlin, Berlin, Germany