🤖 AI Summary
Conformal prediction often yields prediction sets that are too large to be informative, reflecting a tension between statistical validity and practical utility. To address this, we propose the first framework that systematically integrates test-time augmentation (TTA) into conformal prediction. TTA introduces lightweight inductive biases at inference time, requires no model retraining, and remains compatible with arbitrary conformal score functions (e.g., APS, RAPS) and adaptive quantile calibration. Our method preserves rigorous marginal coverage guarantees while producing substantially smaller, more informative sets. Evaluation across three diverse datasets, three model architectures, and multiple distribution shifts shows consistent improvements: average prediction set size is reduced by 10–14%, supporting the framework's generality and robustness.
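For intuition, the core idea is simply to feed aggregated, augmentation-averaged probabilities into an otherwise unchanged conformal pipeline. Below is a minimal sketch of that aggregation step; the helper names (`predict_proba`, `augmentations`) and the plain mean over views are illustrative assumptions, and the framework itself is agnostic to the augmentation policy and score function.

```python
import numpy as np

# Hypothetical helpers (illustrative assumptions, not the paper's API):
#   predict_proba(x)  -> np.ndarray of softmax probabilities, shape (n_classes,)
#   augmentations     -> list of callables mapping an input to an augmented view

def tta_probs(x, predict_proba, augmentations):
    """Aggregate the classifier's softmax output over the original input
    and its test-time augmented views via a simple mean."""
    views = [x] + [aug(x) for aug in augmentations]
    probs = np.stack([predict_proba(v) for v in views])
    return probs.mean(axis=0)  # aggregated class probabilities
```

Because the aggregation happens before scoring, any downstream conformal score can consume the result unchanged, which is what makes the approach retraining-free.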
📝 Abstract
A conformal classifier produces a set of predicted classes and provides a probabilistic guarantee that the set includes the true class. Unfortunately, conformal classifiers often produce uninformatively large sets. In this work, we show that test-time augmentation (TTA), a technique that introduces inductive biases during inference, reduces the size of the sets produced by conformal classifiers. Our approach is flexible, computationally efficient, and effective: it can be combined with any conformal score, requires no model retraining, and reduces prediction set sizes by 10–14% on average. We evaluate the approach across three datasets, three models, two established conformal scoring methods, different guarantee strengths, and several distribution shifts to show when and why test-time augmentation is a useful addition to the conformal pipeline.
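To make the pipeline concrete, here is a minimal sketch of split conformal prediction with a non-randomized APS-style score, into which the TTA-averaged probabilities above would be plugged. This is an illustration under stated assumptions (a held-out calibration set, the non-randomized APS score, NumPy), not the authors' reference implementation.

```python
import numpy as np

def aps_score(probs, label):
    """Non-randomized APS score: cumulative probability mass of all
    classes ranked at or above the true label."""
    order = np.argsort(probs)[::-1]          # classes sorted by descending probability
    cum = np.cumsum(probs[order])
    rank = np.where(order == label)[0][0]    # position of the true class
    return cum[rank]

def calibrate(cal_probs, cal_labels, alpha=0.1):
    """Split-conformal threshold: the finite-sample-corrected quantile of
    calibration scores that yields 1 - alpha marginal coverage."""
    n = len(cal_labels)
    scores = np.array([aps_score(p, y) for p, y in zip(cal_probs, cal_labels)])
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)  # needs n >= ~1/alpha
    return np.quantile(scores, level, method="higher")

def prediction_set(probs, qhat):
    """Smallest set of top-ranked classes whose cumulative probability
    reaches the calibrated threshold qhat."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    k = np.searchsorted(cum, qhat) + 1
    return order[:k]
```

In this sketch, `cal_probs` (and the test-time `probs`) would be the TTA-aggregated outputs rather than single-view softmax scores; the coverage guarantee is unaffected because calibration and prediction use the same probability model, while the sharper probabilities tend to shrink the resulting sets.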