🤖 AI Summary
This work addresses the issue of unreliable probability estimates in deep neural networks, which often exhibit overconfidence and lack effective mechanisms for class-adaptive uncertainty quantification. To this end, the authors formulate conformal prediction as an augmented Lagrangian optimization problem, enabling, for the first time, end-to-end learning of class-conditional prediction sets without any distributional assumptions. The proposed method dynamically optimizes the shape and size of prediction sets for each class, significantly enhancing their informativeness and practical utility. Experimental results across multiple image recognition and text classification benchmarks demonstrate that the resulting prediction sets are substantially smaller and more discriminative while rigorously maintaining the desired coverage guarantee.
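The summary mentions formulating conformal training as an augmented Lagrangian problem. As a rough illustration of that general machinery only (this is the standard PHR-style form for inequality constraints, not the authors' actual CaCT objective; `g` stands in for hypothetical per-class constraint values such as set-size violations):

```python
import numpy as np

def al_penalty(g, lam, rho):
    """PHR augmented-Lagrangian term for inequality constraints g_k <= 0.

    g:   per-class constraint values (g_k <= 0 means satisfied)
    lam: per-class Lagrange multipliers (nonnegative)
    rho: penalty strength
    """
    t = np.maximum(0.0, lam + rho * g)
    return (t ** 2 - lam ** 2).sum() / (2.0 * rho)

def update_multipliers(lam, g, rho):
    """Dual ascent step: lam_k <- max(0, lam_k + rho * g_k)."""
    return np.maximum(0.0, lam + rho * g)
```

In a training loop one would alternate minimizing the model loss plus `al_penalty` over the network weights with `update_multipliers` on the duals; the paper's actual formulation and schedules are not reproduced here.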
📝 Abstract
Deep neural networks have achieved remarkable success across a variety of tasks, yet they often produce unreliable probability estimates and are consequently overconfident in their predictions. Conformal Prediction (CP) offers a principled framework for uncertainty quantification, yielding prediction sets with rigorous coverage guarantees. Existing conformal training methods optimize for overall set size, but shaping the prediction sets in a class-conditional manner is not straightforward and typically requires prior knowledge of the data distribution. In this work, we introduce Class Adaptive Conformal Training (CaCT), which formulates conformal training as an augmented Lagrangian optimization problem that adaptively learns to shape prediction sets class-conditionally without making any distributional assumptions. Experiments on multiple benchmark datasets, including standard and long-tailed image recognition as well as text classification, demonstrate that CaCT consistently outperforms prior conformal training methods, producing significantly smaller and more informative prediction sets while maintaining the desired coverage guarantees.
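To make the coverage guarantee concrete, here is a minimal sketch of textbook split conformal prediction with softmax-based nonconformity scores. This is the standard baseline procedure, not the CaCT method; the toy data and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy calibration set: softmax outputs over K classes plus true labels.
# In practice these come from a held-out calibration split of real data.
n, K = 1000, 5
logits = rng.normal(size=(n, K))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, K, size=n)

# Nonconformity score: 1 minus the softmax probability of the true class.
scores = 1.0 - probs[np.arange(n), labels]

# Conformal quantile for target coverage 1 - alpha = 0.9.
alpha = 0.1
q_level = np.ceil((n + 1) * (1 - alpha)) / n
qhat = np.quantile(scores, q_level, method="higher")

# Prediction set for a new example: every class whose score is <= qhat.
new_probs = rng.dirichlet(np.ones(K))
pred_set = np.where(1.0 - new_probs <= qhat)[0]
```

Under exchangeability of calibration and test data, sets built this way contain the true label with probability at least 1 - alpha marginally; CaCT's contribution, per the abstract, is learning to shape such sets class-conditionally during training rather than only calibrating post hoc.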