Why Machine Learning Models Fail to Fully Capture Epistemic Uncertainty

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing second-order uncertainty modeling methods systematically underestimate epistemic uncertainty, primarily because they neglect model bias—a critical source of epistemic uncertainty—and misattribute the resulting error to aleatoric uncertainty. Method: This work explicitly incorporates model bias into the epistemic uncertainty taxonomy, proposing a fine-grained classification framework and an extended bias–variance decomposition. We further design a synthetic-data evaluation protocol to simulate multiple sources of epistemic uncertainty. Contribution/Results: We theoretically prove that high model bias leads to substantial underestimation of epistemic uncertainty. Empirical results show that mainstream methods underestimate epistemic uncertainty by 42%–68% on average; moreover, aleatoric uncertainty estimates are reliable only when model bias is well controlled. This work establishes a more rigorous foundation for epistemic uncertainty modeling in trustworthy machine learning.

📝 Abstract
In recent years, various supervised learning methods that disentangle aleatoric and epistemic uncertainty based on second-order distributions have been proposed. We argue that these methods fail to capture critical components of epistemic uncertainty, particularly due to the often-neglected component of model bias. To show this, we make use of a more fine-grained taxonomy of epistemic uncertainty sources in machine learning models, and analyse how the classical bias-variance decomposition of the expected prediction error splits into different parts reflecting these uncertainties. By using a simulation-based evaluation protocol which encompasses epistemic uncertainty due to both procedural and data-driven uncertainty components, we illustrate that current methods rarely capture the full spectrum of epistemic uncertainty. Through theoretical insights and synthetic experiments, we show that high model bias can lead to misleadingly low estimates of epistemic uncertainty, and that common second-order uncertainty quantification methods systematically blur bias-induced errors into aleatoric estimates, thereby underrepresenting epistemic uncertainty. Our findings underscore that meaningful aleatoric estimates are feasible only if all relevant sources of epistemic uncertainty are properly represented.
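The bias-leakage effect described in the abstract can be illustrated with a small simulation (a sketch under assumed settings, not the paper's actual protocol): fit a deliberately misspecified linear model class to data from a nonlinear function, estimate epistemic uncertainty as bootstrap-ensemble variance and aleatoric uncertainty from residuals of the ensemble mean. The ensemble stays confidently wrong (small spread), while the squared bias inflates the residual-based "aleatoric" estimate far above the true noise level.

```python
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: np.sin(2 * np.pi * x)   # assumed nonlinear ground truth
sigma = 0.1                           # true aleatoric noise std
n = 200

# one observed dataset
x = rng.uniform(0, 1, n)
y = f(x) + rng.normal(0, sigma, n)

# bootstrap ensemble of a misspecified (linear) model class
n_members = 100
preds = np.empty((n_members, n))
for m in range(n_members):
    idx = rng.integers(0, n, n)                    # resample with replacement
    coef = np.polyfit(x[idx], y[idx], deg=1)       # deg=1 cannot represent sin
    preds[m] = np.polyval(coef, x)

epistemic = preds.var(axis=0).mean()                 # ensemble spread: stays small
aleatoric = ((y - preds.mean(axis=0)) ** 2).mean()   # residuals absorb squared bias

print(f"epistemic estimate: {epistemic:.4f}")
print(f"aleatoric estimate: {aleatoric:.4f}  (true noise variance: {sigma**2:.4f})")
```

The residual-based aleatoric estimate is roughly the true noise variance plus the average squared bias of the linear model class, so the model's systematic error is misread as irreducible data noise, exactly the failure mode the paper analyses.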
Problem

Research questions and friction points this paper is trying to address.

Current methods fail to capture full epistemic uncertainty spectrum
Model bias leads to misleadingly low epistemic uncertainty estimates
Second-order methods blur bias-induced errors into aleatoric estimates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Second-order distributions disentangle uncertainty types
Simulation protocol evaluates epistemic uncertainty sources
Bias-variance decomposition reveals uncertainty underestimation
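The decomposition the bullets refer to builds on the classical identity, expected squared error = bias² + variance + noise; a minimal Monte Carlo check at a single query point (illustrative, with an assumed sine ground truth and a misspecified linear learner, not the paper's extended decomposition):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(2 * np.pi * x)   # assumed ground-truth function
sigma = 0.1                           # aleatoric noise std
n_train, n_runs, x0 = 30, 2000, 0.35  # x0 is the query point

# refit the (biased) linear model on many independently drawn training sets
preds = np.empty(n_runs)
for r in range(n_runs):
    x = rng.uniform(0, 1, n_train)
    y = f(x) + rng.normal(0, sigma, n_train)
    preds[r] = np.polyval(np.polyfit(x, y, deg=1), x0)

bias2 = (preds.mean() - f(x0)) ** 2  # systematic error of the model class
var   = preds.var()                  # spread across training sets
noise = sigma ** 2                   # irreducible aleatoric part

# expected squared error vs. a noisy label at x0
expected_error = ((preds - f(x0)) ** 2).mean() + noise
print(f"bias^2={bias2:.4f}  variance={var:.4f}  noise={noise:.4f}")
print(f"expected error {expected_error:.4f} = sum of parts {bias2 + var + noise:.4f}")
```

Here bias² dominates the error, yet an uncertainty estimate built only from the variance term (e.g. ensemble spread) would report low epistemic uncertainty, which is the underestimation the paper's extended decomposition makes explicit.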
Sebastián Jiménez
Ghent University
Mira Jürgens
Ghent University
Willem Waegeman
Ghent University
machine learning · data science · bioinformatics