Online Conformal Prediction with Adversarial Semi-bandit Feedback via Regret Minimization

📅 2026-04-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of maintaining reliable coverage in online uncertainty quantification when the ground-truth label is revealed only if it falls inside the predicted set, a partial (semi-bandit) feedback setting with an adaptive adversary. It formulates online conformal prediction as an adversarial bandit problem in which each candidate prediction set is an arm, and it constructs prediction sets dynamically by minimizing regret. The method makes no distributional assumptions, so it applies in fully adversarial environments. Theoretically, it ties the long-run coverage guarantee directly to the learner’s regret. Empirical evaluations show that the method controls the miscoverage rate on both i.i.d. and non-i.i.d. data while keeping prediction sets reasonably small.
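The partial-feedback rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: building the set by thresholding per-label scores, and the names `prediction_set` and `partial_feedback`, are assumptions made for the example.

```python
from typing import List, Optional, Sequence, Tuple


def prediction_set(scores: Sequence[float], threshold: float) -> List[int]:
    """Return the labels whose score is at least `threshold`.

    Assumption: higher score means a more plausible label, so a lower
    threshold yields a larger prediction set.
    """
    return [k for k, s in enumerate(scores) if s >= threshold]


def partial_feedback(
    scores: Sequence[float], true_label: int, threshold: float
) -> Tuple[List[int], Optional[int]]:
    """Build the set and reveal the true label only if the set contains it."""
    pred = prediction_set(scores, threshold)
    revealed = true_label if true_label in pred else None
    return pred, revealed


scores = [0.6, 0.3, 0.1]
# True label inside the set: the learner sees the label.
covered = partial_feedback(scores, true_label=1, threshold=0.2)
# True label outside the set: the learner only learns that it missed.
missed = partial_feedback(scores, true_label=2, threshold=0.2)
```

When the label is missed, the learner knows only that the chosen set failed to cover it, which is what separates this setting from full feedback.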

📝 Abstract

Uncertainty quantification is crucial in safety-critical systems, where decisions must be made under uncertainty. In particular, we consider the problem of online uncertainty quantification, where data points arrive sequentially. Online conformal prediction is a principled online uncertainty quantification method that dynamically constructs a prediction set at each time step. While existing methods for online conformal prediction provide long-run coverage guarantees without any distributional assumptions, they typically assume a full feedback setting in which the true label is always observed. In this paper, we propose a novel learning method for online conformal prediction with partial feedback from an adaptive adversary, a more challenging setup where the true label is revealed only when it lies inside the constructed prediction set. Specifically, we formulate online conformal prediction as an adversarial bandit problem by treating each candidate prediction set as an arm. Building on an existing algorithm for adversarial bandits, our method achieves a long-run coverage guarantee by explicitly establishing its connection to the regret of the learner. Finally, we empirically demonstrate the effectiveness of our method in both independent and identically distributed (i.i.d.) and non-i.i.d. settings, showing that it successfully controls the miscoverage rate while maintaining a reasonable size of the prediction set.
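To make the bandit formulation concrete, the sketch below treats each candidate threshold as an arm and runs EXP3, a standard adversarial bandit algorithm. The abstract does not name the algorithm the authors build on, so EXP3, the loss function, the threshold grid, and every identifier here are assumptions chosen for illustration. The sketch carries none of the paper's guarantees.

```python
import math
import random


def exp3_conformal(rounds, thresholds, alpha=0.1, eta=0.05, seed=0):
    """Run an EXP3-style learner over candidate thresholds (the arms).

    `rounds` is a non-empty list of (scores, true_label) pairs. Returns the
    empirical miscoverage rate. The loss below is a hypothetical choice.
    """
    rng = random.Random(seed)
    num_arms = len(thresholds)
    cum_loss_est = [0.0] * num_arms  # importance-weighted loss estimates
    misses = 0
    for scores, true_label in rounds:
        # Exponential weights; subtracting the minimum keeps exp() stable.
        base = min(cum_loss_est)
        weights = [math.exp(-eta * (est - base)) for est in cum_loss_est]
        total = sum(weights)
        probs = [w / total for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]

        # Play the chosen arm: build the set and observe partial feedback.
        pred = [k for k, s in enumerate(scores) if s >= thresholds[arm]]
        covered = true_label in pred
        misses += not covered

        # Hypothetical loss in [0, 1]: a miss costs 1, a covering set pays
        # a small penalty proportional to its relative size.
        loss = alpha * len(pred) / len(scores) if covered else 1.0

        # Only the played arm is updated, using the importance-weighted loss.
        cum_loss_est[arm] += loss / probs[arm]
    return misses / len(rounds)


rounds = [([0.7, 0.2, 0.1], 0), ([0.5, 0.4, 0.1], 1)] * 50
rate = exp3_conformal(rounds, thresholds=[0.0, 0.15, 0.3, 0.45])
```

Updating only the played arm is the plain bandit update. A semi-bandit variant could also update arms whose outcome is implied by the observed feedback, which this sketch does not attempt.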

Problem

Research questions and friction points this paper is trying to address.

online conformal prediction

adversarial feedback

partial feedback

uncertainty quantification

semi-bandit

Innovation

Methods, ideas, or system contributions that make the work stand out.

online conformal prediction

adversarial bandits

partial feedback