🤖 AI Summary
This work investigates the fundamental relationship between conformal prediction (CP) and Bayesian inference, challenging the common misconception that CP implicitly performs Bayesian conditioning. Using tools from measure theory, finitely and countably additive probability, Blackwell's comparison of experiments, and Le Cam deficiency analysis, we establish, systematically for the first time, four formal semantic separations between CP and Bayesian prediction: (i) CP violates conditional extensionality; (ii) it does not combine coherently across data sources; (iii) it cannot induce regular conditional distributions; and (iv) it lacks a countably additive kernel structure. Our analysis places CP in the Fisher–Dempster–Hill tradition of rank-calibration methods; it therefore lacks compositional semantics for sequential updating, downstream decision-making, and prediction-driven inference, and is vulnerable to Dutch-book arguments. This work rigorously clarifies CP's non-Bayesian foundations, providing precise theoretical grounding for its scope and limitations.
📝 Abstract
Conformal prediction (CP) is widely presented as distribution-free predictive inference with finite-sample marginal coverage under exchangeability. We argue that CP is best understood as a rank-calibrated descendant of the Fisher–Dempster–Hill fiducial/direct-probability tradition rather than as Bayesian conditioning in disguise.
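To ground the guarantee under discussion, here is a minimal split-conformal sketch (illustrative only, not taken from the paper; the regressor, data, and names are placeholders). With calibration scores s_i = |y_i − f̂(x_i)| and q̂ the ⌈(n+1)(1−α)⌉/n empirical quantile of those scores, the set {y : |y − f̂(x)| ≤ q̂} covers the true label with probability at least 1−α, marginally over exchangeable draws:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Illustrative data: Y = X + noise. Any regressor works; the guarantee
# is marginal over (X, Y) pairs, not conditional on a particular x.
X = rng.normal(size=(2000, 1))
y = X[:, 0] + rng.normal(scale=0.5, size=2000)

X_tr, y_tr = X[:1000], y[:1000]        # proper training split
X_cal, y_cal = X[1000:], y[1000:]      # calibration split

model = LinearRegression().fit(X_tr, y_tr)

# Nonconformity scores on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))

alpha = 0.1
n = len(scores)
# Finite-sample-corrected quantile level: ceil((n+1)(1-alpha))/n.
q_hat = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

def prediction_set(x_new):
    """Interval {y : |y - f_hat(x_new)| <= q_hat}. Coverage >= 1 - alpha
    holds marginally under exchangeability; there is no conditional guarantee."""
    center = model.predict(np.atleast_2d(x_new))[0]
    return center - q_hat, center + q_hat

print(prediction_set([0.0]))
```

Note that the guarantee averages over both the calibration set and the test pair; this marginal character is precisely what the separations below interrogate.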
We establish four separations from coherent countably additive predictive semantics. First, canonical conformal constructions violate conditional extensionality: prediction sets can depend on the marginal design P(X) even when P(Y|X) is fixed. Second, any finitely additive sequential extension preserving rank calibration is nonconglomerable, implying vulnerability to countable Dutch books. Third, rank-calibrated updates cannot be realized as regular conditionals of any countably additive exchangeable law on Y^∞. Fourth, formalizing both paradigms as families of one-step predictive kernels, conformal and Bayesian kernels coincide only on a meagre subset (in the Baire category sense) of the space of predictive laws.
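The first separation is easy to exhibit numerically. In the hypothetical simulation below (distributions chosen purely for illustration), the conditional law P(Y|X) and the predictor are held fixed while only the design P(X) changes; the conformal quantile, and with it the prediction set at every test point, changes too. A Bayesian posterior predictive for Y | X = x would be invariant to such a shift:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.1

def conformal_halfwidth(x_cal):
    """Split-conformal half-width under the FIXED conditional law
    Y | X=x ~ N(x, (0.1 + |x|)^2) and the fixed predictor f(x) = x
    (the true conditional mean)."""
    y_cal = x_cal + (0.1 + np.abs(x_cal)) * rng.normal(size=len(x_cal))
    scores = np.abs(y_cal - x_cal)       # nonconformity scores
    n = len(scores)
    level = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(scores, level, method="higher")

n = 5000
narrow = rng.normal(scale=0.2, size=n)   # design concentrated near 0
wide = rng.normal(scale=3.0, size=n)     # design spread over large |x|

# Same P(Y|X), same predictor, same test point -- different prediction
# sets, purely because the marginal design P(X) changed.
print("half-width under narrow design:", conformal_halfwidth(narrow))
print("half-width under wide design:  ", conformal_halfwidth(wide))
```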
We further show that rank- and proxy-based reductions are generically Blackwell-deficient relative to full-data experiments, yielding positive Le Cam deficiency for suitable losses. Extending the analysis to prediction-powered inference (PPI) yields an analogous message: bias-corrected, proxy-rectified estimators can be valid as confidence devices while failing to define transportable belief states across stages, distribution shifts, or adaptive selection. Together, the results sharpen a general limitation of wrapper-style methods such as CP and PPI: finite-sample calibration guarantees do not by themselves supply composable semantics for sequential updating or downstream decision-making.
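For reference, the Le Cam deficiency of an experiment E = (P_θ)_θ relative to F = (Q_θ)_θ is

$$\delta(E, F) \;=\; \inf_{K}\,\sup_{\theta}\,\bigl\| K P_\theta - Q_\theta \bigr\|_{\mathrm{TV}},$$

the smallest worst-case total-variation error achievable when reproducing F from E through a randomization (a Markov kernel) K. "Blackwell-deficient" means δ(E, F) > 0: no post-processing of the reduced rank or proxy data recovers the full-data experiment.

On the PPI side, the basic mean estimator makes the "confidence device versus belief state" contrast concrete. Below is a minimal sketch with synthetic data; the bias structure and sample sizes are illustrative, and ppi_mean is a hypothetical helper, not an API from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def ppi_mean(y_labeled, f_labeled, f_unlabeled):
    """Prediction-powered estimate of E[Y]: proxy mean on unlabeled data
    plus a rectifier fitted on labeled pairs. The point estimate is
    unbiased whenever the labeled sample is representative, no matter
    how biased the proxy f is."""
    rectifier = np.mean(y_labeled - f_labeled)   # estimates E[Y - f(X)]
    return np.mean(f_unlabeled) + rectifier

# Illustrative setup: the proxy systematically overshoots by 0.5.
y_lab = rng.normal(loc=1.0, size=200)
f_lab = y_lab + 0.5 + rng.normal(scale=0.1, size=200)
f_unl = rng.normal(loc=1.0, size=10_000) + 0.5 + rng.normal(scale=0.1, size=10_000)

print("proxy-only mean:", np.mean(f_unl))                  # ~1.5, biased
print("PPI mean:       ", ppi_mean(y_lab, f_lab, f_unl))   # ~1.0
```

The rectifier is a one-shot correction that yields a valid confidence interval for E[Y], but, as argued above, it does not define a predictive law that could be conditioned on, transported across stages, or composed with downstream decisions.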