A Variational Estimator for $L_p$ Calibration Errors

📅 2026-02-27
🤖 AI Summary
This work addresses the estimation of calibration error in multiclass settings when the error is defined via $L_p$ divergences. Existing non-variational estimators tend to overestimate this quantity and do not distinguish over-confidence from under-confidence. The authors extend a recent variational framework, previously limited to divergences induced by proper losses, to a broad class of calibration errors induced by $L_p$ divergences. The resulting estimator avoids overestimation and separates over- and under-confidence in model predictions. The authors report extensive experiments, and the implementation is integrated into the open-source package probmetrics.

📝 Abstract
Calibration, the problem of ensuring that predicted probabilities align with observed class frequencies, is a basic desideratum for reliable prediction with machine learning systems. Calibration error is traditionally assessed via a divergence function, using the expected divergence between predictions and empirical frequencies. Accurately estimating this quantity is challenging, especially in the multiclass setting. Here, we show how to extend a recent variational framework for estimating calibration errors beyond divergences induced by proper losses, to cover a broad class of calibration errors induced by $L_p$ divergences. Our method can separate over- and under-confidence and, unlike non-variational approaches, avoids overestimation. We provide extensive experiments and integrate our code in the open-source package probmetrics (https://github.com/dholzmueller/probmetrics) for evaluating calibration errors.
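
To make the overestimation issue concrete, the sketch below implements a classical binned (non-variational) estimate of a top-label $L_p$ calibration error. It is not the variational estimator from the paper and does not use the probmetrics API. The function name `binned_lp_calibration_error`, the equal-width binning, and the default of 15 bins are all assumptions chosen for illustration.

```python
import numpy as np


def binned_lp_calibration_error(confidences, correct, p=2, n_bins=15):
    """Plug-in binned estimate of a top-label L_p calibration error.

    Hypothetical helper for illustration only: a classical binning baseline,
    not the variational estimator described in the abstract.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bin_ids = np.digitize(confidences, edges[1:-1])
    total = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        # Gap between empirical accuracy and mean confidence within the bin.
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        # Weight the p-th power of the gap by the fraction of samples in the bin.
        total += mask.mean() * gap ** p
    return total ** (1.0 / p)


# Synthetic predictions that are perfectly calibrated by construction:
# each prediction is correct with probability equal to its confidence.
rng = np.random.default_rng(0)
conf = rng.uniform(0.5, 1.0, size=2000)
hit = rng.uniform(size=2000) < conf
estimate = binned_lp_calibration_error(conf, hit, p=2)
print(f"binned L_2 estimate on calibrated data: {estimate:.4f}")
```

The true calibration error of this synthetic predictor is zero, yet the binned estimate is positive because sampling noise within each bin is counted as miscalibration. This is one form of the overestimation that the abstract attributes to non-variational approaches.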
Problem

Research questions and friction points this paper is trying to address.

calibration error
$L_p$ divergence
multiclass classification
probability calibration
estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

variational estimation
$L_p$ calibration error
probability calibration
overconfidence detection
probmetrics