Nearly Optimal Bayesian Inference for Structural Missingness

📅 2026-01-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of structural missing data arising from causal or logical constraints—often dependent on observed variables, unobserved confounders, and other missingness indicators—falling under the Missing Not At Random (MNAR) mechanism. Conventional imputation methods frequently introduce bias and yield overconfident predictions in such settings. To overcome these limitations, the authors propose a Bayesian decoupling framework that first leverages prior knowledge encoded in a Structural Causal Model (SCM) to infer the posterior distribution of missing values, then decouples this imputation step from downstream predictive tasks, enabling plug-and-play propagation of uncertainty. The approach achieves state-of-the-art performance across 43 classification tasks and 15 imputation benchmarks, while offering near-Bayes-optimal guarantees in limited-sample regimes.

📝 Abstract
Structural missingness breaks "just impute and train": values can be undefined by causal or logical constraints, and the mask may depend on observed variables, unobserved variables (MNAR), and other missingness indicators. This simultaneously raises (i) a causal-loop catch-22: prediction needs the missing features, yet inferring them depends on the missingness mechanism; (ii) distribution shift under MNAR: the missing values can come from a shifted distribution; and (iii) plug-in imputation risk: a single fill-in locks in uncertainty and yields overconfident, biased decisions. In the Bayesian view, prediction via the posterior predictive distribution integrates over the full model posterior uncertainty rather than relying on a single point estimate. The proposed framework decouples (i) learning an in-model posterior over missing values from (ii) label prediction by optimizing the posterior predictive distribution, enabling posterior integration. This decoupling yields an in-model almost-free lunch: once the posterior is learned, prediction is plug-and-play while preserving uncertainty propagation. The method achieves SOTA on 43 classification and 15 imputation benchmarks, with finite-sample near-Bayes-optimality guarantees under the authors' SCM prior.
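The decoupling described in the abstract—learn a posterior over missing values once, then integrate it out at prediction time—can be illustrated with a minimal Monte Carlo sketch. This is not the paper's model: `sample_missing_posterior` and `predict_proba` below are hypothetical stand-ins for a learned missing-value posterior and a downstream classifier, and the Gaussian draw is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_posterior_predictive(x_obs, mask, sample_missing_posterior,
                                 predict_proba, n_samples=200):
    """Monte Carlo estimate of p(y | x_obs) = E_{x_mis ~ posterior}[p(y | x)].

    Instead of committing to one imputation, draw many completions of the
    missing entries and average the resulting class probabilities, so
    imputation uncertainty propagates into the prediction.
    """
    probs = []
    for _ in range(n_samples):
        x = x_obs.copy()
        x[mask] = sample_missing_posterior(x_obs, mask)  # one posterior draw
        probs.append(predict_proba(x))
    return np.mean(probs, axis=0)

# Hypothetical stand-in for a learned posterior p(x_mis | x_obs).
def sample_missing_posterior(x_obs, mask):
    return rng.normal(loc=0.0, scale=1.0, size=int(mask.sum()))

# Hypothetical stand-in for a fixed downstream classifier.
def predict_proba(x):
    p1 = 1.0 / (1.0 + np.exp(-x.sum()))
    return np.array([1.0 - p1, p1])

x_raw = np.array([0.5, np.nan, 1.2])
mask = np.isnan(x_raw)
x_obs = np.nan_to_num(x_raw)  # masked entries are overwritten per draw
p = predict_posterior_predictive(x_obs, mask,
                                 sample_missing_posterior, predict_proba)
```

Because the classifier is fixed and only the posterior draws change, swapping in a different predictor is plug-and-play, which mirrors the "almost-free lunch" the abstract claims for the decoupled design.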
Problem

Research questions and friction points this paper is trying to address.

structural missingness
MNAR
causal loop
distribution shift
uncertainty propagation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian inference
structural missingness
posterior predictive distribution
MNAR
uncertainty propagation