🤖 AI Summary
This study addresses the limited robustness of traditional prediction methods under data contamination, in particular plug-in predictors whose influence functions are unbounded. The authors combine Bayesian predictive modeling with ideas from conformal prediction and propose conformal-projective prediction (CPP), whose conformity criterion is distributional rather than residual-based: a candidate value for a future response conforms to the extent that adding it to the data leaves the leave-one-out predictive distributions of the observed responses undisturbed. They establish a bounded-influence proposition and a local convexity lemma, and prove that CPP dominates any plug-in predictor with unbounded influence in asymptotic variance under ε-contamination models. When the posterior mean is linear in the observations, the method admits closed-form or one-dimensional optimization solutions with an efficient rank-two computational update. Simulations and two data analyses under the Gaussian linear model illustrate its finite-sample advantages across contamination levels, sample sizes, and predictor dimensions, in line with the theory.
📝 Abstract
We propose a general robust prediction framework, termed conformal-projective prediction (CPP), that integrates Bayesian predictive modeling with ideas from conformal prediction. Rather than assessing conformity through residual-based scores, the CPP criterion defines conformity distributionally: a candidate value for a future response is considered conforming to the extent that its inclusion in the data leaves the leave-one-out predictive distributions of the observed responses undisturbed. The framework requires only that the leave-one-out and swapped predictive distributions are available in closed form and that the swapped predictive mean is differentiable in the candidate value. Under these conditions, we establish a general bounded-influence proposition and a general local convexity lemma, and prove that CPP dominates any plug-in predictor with unbounded influence in asymptotic variance under $\varepsilon$-contamination models. When the posterior mean is linear in the observations, as in Gaussian linear models, basis-expansion regression, and Gaussian process regression, the swapped predictive mean is affine in the candidate value, yielding closed-form or one-dimensional optimization solutions and an efficient rank-two computational update; all general theoretical results specialize to explicit corollaries in this setting. Simulation experiments and two data analyses under the Gaussian linear model illustrate the finite-sample advantages of the proposed method, confirming the theoretical predictions across contamination levels, sample sizes, and predictor dimensions.
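The affine structure mentioned for linear posterior means can be checked numerically. The following is a minimal sketch, not the authors' implementation: the abstract does not spell out how the "swapped" predictive distribution is constructed, so the code assumes it means dropping observation `i` and adding the candidate pair `(x_new, y_cand)`. The function name `loo_augmented_mean`, the prior scale `tau2`, and the simulated data are all hypothetical. Under that assumption, the predictive mean at `X[i]` in a conjugate Gaussian linear model is a fixed linear combination of the responses, so it should trace a straight line as `y_cand` varies.

```python
import numpy as np

def loo_augmented_mean(X, y, x_new, y_cand, i, tau2=10.0):
    """Predictive mean at X[i] from a conjugate Gaussian linear model
    (assumed prior: beta ~ N(0, tau2 * sigma^2 * I)), fitted to the data
    with observation i removed and the candidate pair (x_new, y_cand) added.
    This construction is an assumption made for illustration only."""
    keep = np.arange(len(y)) != i
    Xa = np.vstack([X[keep], x_new])      # design: drop row i, append x_new
    ya = np.append(y[keep], y_cand)       # responses: drop y_i, append candidate
    p = X.shape[1]
    A = Xa.T @ Xa + np.eye(p) / tau2      # posterior precision (up to sigma^2)
    beta = np.linalg.solve(A, Xa.T @ ya)  # posterior mean of beta
    return float(X[i] @ beta)

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n, p = 30, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)
x_new = rng.normal(size=p)

# Two evaluations pin down the line: intercept at 0, slope from 0 -> 1.
i = 4
m0 = loo_augmented_mean(X, y, x_new, 0.0, i)
m1 = loo_augmented_mean(X, y, x_new, 1.0, i)
slope, intercept = m1 - m0, m0

# Compare direct refits against the affine formula on a grid of candidates.
cands = np.linspace(-50.0, 50.0, 11)
direct = np.array([loo_augmented_mean(X, y, x_new, c, i) for c in cands])
affine = intercept + slope * cands
max_gap = float(np.max(np.abs(direct - affine)))
print(max_gap)  # should be at floating-point rounding level
```

In this assumed construction, the matrix `Xa.T @ Xa` differs from `X.T @ X` by removing one outer product and adding another, which is a rank-two change and is at least consistent with the rank-two update the abstract mentions. The sketch refits from scratch for clarity rather than exploiting that structure.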