Truthfulness of Decision-Theoretic Calibration Measures

📅 2025-03-04
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing calibration metrics fundamentally fail to reconcile decision rationality (i.e., no-regret decisions) with truthfulness (i.e., minimizing expected error in probability reporting). Method: We propose StepCEˢᵘᵇ, the first calibration metric simultaneously satisfying both properties. It integrates a stepwise loss design, subsampling, and probabilistic perturbation analysis to construct a complete, decision-rational, and approximately truthful calibration measure. Contribution/Results: We prove that any complete and decision-rational calibration metric must be discontinuous and untruthful in non-smooth settings. StepCEˢᵘᵇ achieves an $O(1)$ truthfulness guarantee under product distributions and $O(sqrt{log(1/c)})$ under $c$-smoothness—significantly improving upon prior exponential or polynomial bounds. This resolves a long-standing gap in calibration theory by enabling unified modeling of rationality and truthfulness.

📝 Abstract
Calibration measures quantify how much a forecaster's predictions violate calibration, which requires that forecasts are unbiased conditional on the forecasted probabilities. Two important desiderata for a calibration measure are its decision-theoretic implications (i.e., downstream decision-makers that best-respond to the forecasts are always no-regret) and its truthfulness (i.e., a forecaster approximately minimizes error by always reporting the true probabilities). Existing measures satisfy at most one of the properties, but not both. We introduce a new calibration measure termed subsampled step calibration, $\mathsf{StepCE}^{\mathsf{sub}}$, that is both decision-theoretic and truthful. In particular, on any product distribution, $\mathsf{StepCE}^{\mathsf{sub}}$ is truthful up to an $O(1)$ factor whereas prior decision-theoretic calibration measures suffer from an $e^{-\Omega(T)}$-$\Omega(\sqrt{T})$ truthfulness gap. Moreover, in any smoothed setting where the conditional probability of each event is perturbed by a noise of magnitude $c>0$, $\mathsf{StepCE}^{\mathsf{sub}}$ is truthful up to an $O(\sqrt{\log(1/c)})$ factor, while prior decision-theoretic measures have an $e^{-\Omega(T)}$-$\Omega(T^{1/3})$ truthfulness gap. We also prove a general impossibility result for truthful decision-theoretic forecasting: any complete and decision-theoretic calibration measure must be discontinuous and non-truthful in the non-smoothed setting.
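Reading the abstract's notation, an $e^{-\Omega(T)}$-$\Omega(\sqrt{T})$ truthfulness gap means that truthful reporting can incur penalty $\Omega(\sqrt{T})$ on some product distribution while a non-truthful strategy achieves only $e^{-\Omega(T)}$. Under the standard formalization (assumed here), a measure $\mathsf{CE}$ is truthful up to a factor $\alpha$ if $\mathbb{E}[\mathsf{CE}(\text{truthful reporting})] \le \alpha \cdot \min_{\text{strategies}} \mathbb{E}[\mathsf{CE}(\text{strategy})]$ for every underlying distribution.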
Problem

Research questions and friction points this paper is trying to address.

Designing a calibration measure that is simultaneously decision-theoretic and truthful.
Closing the gap left by existing measures, which satisfy at most one of these two properties.
Characterizing when truthful decision-theoretic forecasting is impossible, namely in the non-smoothed setting.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces the subsampled step calibration measure, $\mathsf{StepCE}^{\mathsf{sub}}$
Achieves decision-theoretic soundness and approximate truthfulness simultaneously
Improves on prior calibration measures, which satisfy at most one of the two properties