🤖 AI Summary
Masking-one-out (MOO) evaluation, as commonly practiced, measures only prediction accuracy and neglects a model's ability to capture the randomness inherent in imputation. Method: We propose three modified MOO criteria, based on rank transformation, energy distance, and the likelihood principle, that jointly quantify the distributional fidelity and predictive utility of imputations. We also clarify the theoretical connection between MOO and the missing-at-random (MAR) assumption, establish a model selection theory for imputation with statistical consistency guarantees, and develop the prediction-imputation diagram, a two-dimensional visualization for comparing models. Contributions: Integrating semiparametric efficiency theory, the Bayesian Information Criterion (BIC), and statistical learning techniques, our approach ensures both asymptotic consistency and computational feasibility. The visualization tool enables intuitive comparison of multiple models at once. Overall, we offer an interpretable, verifiable paradigm for principled imputation model selection.
📝 Abstract
The masking-one-out (MOO) procedure, which masks an observed entry and compares it with its imputed value, is a common way to compare imputation models. We study the optimum of this procedure, generalize it under a missing-data assumption, and establish the corresponding semiparametric efficiency theory. However, MOO measures prediction accuracy, which is not ideal for evaluating an imputation model. To address this issue, we introduce three modified MOO criteria, based on rank transformation, energy distance, and the likelihood principle, that allow us to select an imputation model that properly accounts for the stochastic nature of the data. The likelihood approach further enables an elegant framework for learning an imputation model from the data, and we derive its statistical and computational learning theories as well as the consistency of BIC model selection. We also show how MOO is related to the missing-at-random assumption. Finally, we introduce the prediction-imputation diagram, a two-dimensional diagram that visually compares both the prediction and imputation utilities of various imputation models.
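To make the basic MOO procedure concrete, here is a minimal sketch in Python: each observed entry is masked in turn, the data are re-imputed, and the squared prediction error is averaged. The mean-imputation model and all function names below are our illustration only, not the paper's method or its modified criteria.

```python
import numpy as np

def moo_score(X, impute):
    """Masking-one-out: mask each observed entry in turn, re-impute the
    data, and return the mean squared prediction error (illustrative)."""
    errors = []
    for i, j in zip(*np.where(~np.isnan(X))):  # loop over observed entries
        X_masked = X.copy()
        X_masked[i, j] = np.nan          # mask one observed entry
        X_hat = impute(X_masked)         # impute on the masked data
        errors.append((X_hat[i, j] - X[i, j]) ** 2)
    return float(np.mean(errors))

def mean_impute(X):
    """Toy imputation model: fill each missing entry with its column mean."""
    col_means = np.nanmean(X, axis=0)
    X_filled = X.copy()
    rows, cols = np.where(np.isnan(X_filled))
    X_filled[rows, cols] = col_means[cols]
    return X_filled

X = np.array([[1.0, 2.0],
              [3.0, np.nan],
              [5.0, 6.0]])
print(moo_score(X, mean_impute))  # → 10.0
```

A deterministic imputer like the one above can score well on this criterion while ignoring imputation randomness entirely, which is exactly the limitation the paper's rank-, energy-distance-, and likelihood-based criteria are designed to address.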