🤖 AI Summary
Reliable, broadly applicable methods for quantifying predictive uncertainty are still lacking, especially in high-stakes applications. Method: We propose the first unified framework grounded in information-theoretic first principles, systematically classifying, deriving, and reconstructing uncertainty measures along two orthogonal dimensions: the predicting model and the strategy for approximating the true data distribution. Contribution/Results: We establish a principled taxonomy that makes explicit the implicit assumptions, intrinsic relationships, and task-dependent behaviors of common measures, including entropy, mutual information, and KL divergence. Empirical evaluation on three benchmark tasks (misclassification detection, selective prediction, and out-of-distribution detection) shows that no single measure dominates universally; performance varies significantly across tasks. This work provides both theoretical guidance and empirical evidence for the principled selection of uncertainty measures in safety-critical domains.
📝 Abstract
Reliable estimation of predictive uncertainty is crucial for machine learning applications, particularly in high-stakes scenarios where hedging against risk is essential. Despite its significance, there is still no consensus on how predictive uncertainty should be measured. In this work, we return to first principles to develop a fundamental framework of information-theoretic predictive uncertainty measures. Our proposed framework categorizes predictive uncertainty measures according to two factors: (I) the predicting model and (II) the approximation of the true predictive distribution. By examining all possible combinations of these two factors, we derive a set of predictive uncertainty measures that includes both known and newly introduced ones. We empirically evaluate these measures in typical uncertainty estimation settings such as misclassification detection, selective prediction, and out-of-distribution detection. The results show that no single measure is universal; effectiveness depends on the specific setting. Our work thus clarifies the suitability of predictive uncertainty measures by making their implicit assumptions and relationships explicit.
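To make the measures named above concrete, here is a minimal sketch (not the paper's implementation) of the standard entropy-based decomposition for an ensemble classifier: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the expected per-member entropy, and their difference is the mutual information, often read as epistemic uncertainty. The function name and shapes are illustrative assumptions.

```python
import numpy as np

def uncertainty_decomposition(member_probs):
    """Entropy-based uncertainty decomposition for an ensemble (illustrative sketch).

    member_probs: array of shape (M, C), one row of class probabilities per
    ensemble member. Returns (total, aleatoric, epistemic) in nats.
    """
    eps = 1e-12  # numerical guard against log(0)
    mean_probs = member_probs.mean(axis=0)
    # Total uncertainty: entropy of the mean prediction, H[E_p].
    total = -np.sum(mean_probs * np.log(mean_probs + eps))
    # Aleatoric uncertainty: expected entropy of the members, E[H[p]].
    aleatoric = -np.mean(np.sum(member_probs * np.log(member_probs + eps), axis=1))
    # Epistemic uncertainty: mutual information between prediction and member.
    epistemic = total - aleatoric
    return total, aleatoric, epistemic
```

For two members that confidently disagree, all uncertainty is epistemic (the members themselves are certain); for identical members, the epistemic term vanishes and total equals aleatoric.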