Empirical Bayes Conformal Prediction for Vision and Language Models

📅 2026-05-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses a key limitation of conventional conformal prediction: it ranks candidates using a single nonconformity score (or an average of several), which cannot separate stable signal from noise, so high-variance erroneous candidates can enter the prediction set. The authors propose a framework that integrates empirical Bayes with conformal prediction, using r-values to turn the variability of multiple nonconformity scores into an uncertainty-aware score that accounts for both a candidate's mean performance and its uncertainty. The method preserves the target coverage while provably reducing the inclusion of high-variance incorrect candidates under mild regularity conditions, and it behaves like standard conformal prediction when variability vanishes. Experiments on image classification, CLIP-based vision-language benchmarks, and large language models show improved ranking stability and smaller prediction sets when score variability is informative.
📝 Abstract
Conformal prediction (CP) gives distribution-free coverage for modern vision and language models, but it is often forced to make a ranking decision from a single unstable nonconformity score. Standard CP uses one realization, while average-then-calibrate variants smooth multiple realizations into a point estimate. Both options discard the inconsistency across realizations that can help identify whether a candidate is indeed stable. A weak answer can enter the conformal set even if the evidence is not strong, simply because one posterior sample or prompt phrasing made it look strong. Variability, however, can help distinguish a stable signal from noise-driven fluctuations. We describe an empirical Bayes conformal prediction framework that uses $r$-values to convert score variability into an uncertainty-informed nonconformity score. The resulting $r$-value estimates how likely a candidate's latent score is to belong to the top-ranked group after accounting for both its mean score and its uncertainty. It admits both a closed-form Normal-Normal empirical Bayes estimator and a nonparametric posterior-sampling estimator. Using the $r$-value as the nonconformity score preserves the target conformal coverage while provably reducing the inclusion of high-variance false candidates under mild regularity conditions. Across image classification, CLIP-based VLM benchmarks, and LLMs, we show that $r$-value conformal prediction preserves target coverage while improving ranking stability and reducing set size when variability is informative, and reverts to CP-like behavior when variability vanishes.
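To make the idea in the abstract concrete, here is a minimal Python sketch. It is not the paper's estimator. It assumes a method-of-moments Normal-Normal model, treats higher scores as more plausible candidates, and uses the posterior probability that a candidate's latent score exceeds the prior top-fraction threshold as a simplified stand-in for an $r$-value. All function names (`normal_normal_eb`, `top_group_probability`, `conformal_quantile`, `prediction_set`) and the `top_fraction` parameter are hypothetical and chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist

import numpy as np

_STD_NORMAL = NormalDist()  # standard normal from the stdlib


def normal_normal_eb(means, variances):
    """Method-of-moments Normal-Normal empirical Bayes (illustrative).

    Assumed model: theta_i ~ N(mu, tau2) and
    mean_i | theta_i ~ N(theta_i, variances_i).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    mu = means.mean()
    # Between-candidate variance minus average sampling noise, floored near 0.
    tau2 = max(means.var(ddof=1) - variances.mean(), 1e-12)
    shrink = tau2 / (tau2 + variances)  # closer to 0 for noisy candidates
    post_mean = mu + shrink * (means - mu)
    post_var = shrink * variances
    return mu, tau2, post_mean, post_var


def top_group_probability(means, variances, top_fraction=0.2):
    """Posterior P(theta_i >= prior top-fraction threshold).

    A simplified proxy for an r-value, NOT the paper's definition.
    """
    mu, tau2, post_mean, post_var = normal_normal_eb(means, variances)
    threshold = mu + np.sqrt(tau2) * _STD_NORMAL.inv_cdf(1.0 - top_fraction)
    post_sd = np.sqrt(np.maximum(post_var, 1e-12))  # avoid division by zero
    z = (threshold - post_mean) / post_sd
    return np.array([1.0 - _STD_NORMAL.cdf(float(v)) for v in z])


def conformal_quantile(cal_scores, alpha):
    """Standard split-conformal threshold: the ceil((n+1)(1-alpha))-th
    smallest calibration nonconformity score (infinite if out of range)."""
    cal_scores = np.sort(np.asarray(cal_scores, dtype=float))
    n = cal_scores.size
    k = ceil((n + 1) * (1 - alpha))
    return float("inf") if k > n else float(cal_scores[k - 1])


def prediction_set(nonconformity, q_hat):
    """Indices of candidates whose nonconformity is within the threshold."""
    return [i for i, s in enumerate(nonconformity) if s <= q_hat]
```

In this sketch, a candidate's nonconformity score would be `1 - top_group_probability(...)`, computed from the mean and variance of its scores across realizations (for example, posterior samples or prompt phrasings). The threshold from `conformal_quantile` is calibrated on the nonconformity scores of the true labels in a held-out set, and `prediction_set` then keeps every candidate at or below that threshold. Two candidates with the same mean score but different variances receive different scores, which is the behavior the abstract attributes to using variability.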
Problem

Research questions and friction points this paper is trying to address.

Conformal Prediction
Nonconformity Score
Score Variability
Ranking Stability
Vision and Language Models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical Bayes
Conformal Prediction
r-values
Uncertainty Quantification
Vision-Language Models