🤖 AI Summary
This work addresses the lack of task adaptability in uncertainty quantification (UQ) across diverse downstream applications by proposing a task-adaptive UQ paradigm. Methodologically, it builds on a known decomposition of (strictly) proper scoring rules into a divergence and an entropy component: the entropy part yields a measure of aleatoric uncertainty, the divergence part a measure of epistemic uncertainty, and their sum the total uncertainty. Because this decomposition holds for any proper scoring rule, the scoring rule can be chosen to match the downstream task loss (e.g., 0–1 loss, log loss), enabling UQ measures tailored to the use case. Experiments demonstrate that such loss matching substantially improves selective prediction; mutual information, the epistemic measure induced by the log loss, performs best for out-of-distribution detection; and the epistemic measure based on the 0–1 loss consistently outperforms other uncertainty measures in active learning. The framework thus unifies UQ evaluation under task-specific objectives, improving both interpretability and empirical efficacy.
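For concreteness, the decomposition can be written out as follows. This is the standard entropy-divergence identity for proper scoring rules in our own notation, and the induced measures below are one common instantiation rather than necessarily the paper's exact definitions. Writing $L(p, q) = \mathbb{E}_{y \sim q}[\ell(p, y)]$ for the expected score of prediction $p$ when the ground-truth distribution is $q$:

$$
L(p, q) \;=\; \underbrace{L(p, q) - L(q, q)}_{\text{divergence } D(q,\, p) \,\ge\, 0} \;+\; \underbrace{L(q, q)}_{\text{entropy } H(q)} .
$$

Given predictive distributions $p_\theta$ from an ensemble or posterior, with mean prediction $\bar{p} = \mathbb{E}_\theta[p_\theta]$, this induces an additive split of total uncertainty:

$$
\underbrace{H(\bar{p})}_{\text{total}} \;=\; \underbrace{\mathbb{E}_\theta\big[H(p_\theta)\big]}_{\text{aleatoric}} \;+\; \underbrace{\mathbb{E}_\theta\big[D(p_\theta,\, \bar{p})\big]}_{\text{epistemic}} .
$$

Under the log loss, $H$ is Shannon entropy, $D$ is the KL divergence, and the epistemic term is exactly the mutual information mentioned above; under the 0–1 loss, $H(p) = 1 - \max_y p(y)$.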
📝 Abstract
We address the problem of uncertainty quantification and propose measures of total, aleatoric, and epistemic uncertainty based on a known decomposition of (strictly) proper scoring rules (a specific type of loss function) into a divergence and an entropy component. This leads to a flexible framework for uncertainty quantification that can be instantiated with different losses (scoring rules), which makes it possible to tailor uncertainty quantification to the use case at hand. We show that this flexibility is indeed advantageous. In particular, we analyze the task of selective prediction and show that the scoring rule should ideally match the task loss. In addition, we perform experiments on two other common tasks. For out-of-distribution detection, our results confirm that a widely used measure of epistemic uncertainty, mutual information, performs best. Moreover, in the setting of active learning, our measure of epistemic uncertainty based on the zero-one loss consistently outperforms other uncertainty measures.
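A minimal runnable sketch of the two instantiations discussed above (log loss and zero-one loss), assuming predictions arrive as an (M, K) array of M ensemble members over K classes; the function names and conventions here are illustrative, not taken from the paper's code:

```python
import numpy as np

def uncertainty_log_loss(probs: np.ndarray) -> tuple[float, float, float]:
    """Total/aleatoric/epistemic uncertainty under the log loss.

    probs: (M, K) array of M categorical predictions over K classes.
    """
    eps = 1e-12                                  # guard against log(0)
    mean = probs.mean(axis=0)                    # mean prediction p_bar
    total = -np.sum(mean * np.log(mean + eps))   # Shannon entropy H(p_bar)
    # Aleatoric: expected entropy of the individual members, E[H(p_theta)].
    aleatoric = float(-np.sum(probs * np.log(probs + eps), axis=1).mean())
    # Epistemic: mutual information, equivalently E[KL(p_theta || p_bar)].
    epistemic = total - aleatoric
    return float(total), aleatoric, float(epistemic)

def uncertainty_zero_one_loss(probs: np.ndarray) -> tuple[float, float, float]:
    """The same decomposition instantiated with the zero-one loss."""
    mean = probs.mean(axis=0)
    total = 1.0 - mean.max()                     # generalized entropy 1 - max p_bar
    aleatoric = float((1.0 - probs.max(axis=1)).mean())   # E[1 - max p_theta]
    epistemic = total - aleatoric                # = E[max p_theta] - max p_bar >= 0
    return float(total), aleatoric, float(epistemic)

if __name__ == "__main__":
    # Two confident but disagreeing ensemble members: aleatoric uncertainty
    # is low while epistemic uncertainty is high, under both losses.
    probs = np.array([[0.9, 0.1],
                      [0.1, 0.9]])
    print(uncertainty_log_loss(probs))       # approx (0.693, 0.325, 0.368)
    print(uncertainty_zero_one_loss(probs))  # (0.5, 0.1, 0.4)
```

In the demo, the members are individually confident but disagree, so both epistemic measures are large relative to the aleatoric ones; instances of this kind are exactly what an epistemic-uncertainty-driven active learner would query.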