🤖 AI Summary
The prevailing aleatoric-epistemic dichotomy inadequately captures the heterogeneous origins of uncertainty in machine learning, leading to theoretical ambiguity and practical misuse. This work identifies its core limitation: the failure to distinguish between data-generating mechanisms—such as intrinsic randomness, confounding, and measurement error—and model-specific limitations—including parameter, structural, and distributional uncertainty. To address this, we propose a formal uncertainty taxonomy grounded in probabilistic modeling and data-generating processes. The framework decomposes uncertainty into five interpretable, empirically distinguishable categories, each with a rigorous conceptual definition and mathematical characterization. Compared with the traditional binary classification, our approach improves theoretical rigor, expressive precision, and cross-disciplinary communication in uncertainty modeling. It establishes a more principled foundation for trustworthy AI, robust decision-making, and uncertainty quantification.
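To make the critiqued binary view concrete: it is commonly operationalized as an entropy decomposition over an ensemble of predictive distributions, where "epistemic" uncertainty is whatever remains after subtracting the mean member entropy from the entropy of the mean. The sketch below is illustrative only — the ensemble setup and all names are assumptions, not the paper's proposed taxonomy:

```python
import numpy as np

def entropy(p, axis=-1):
    """Shannon entropy in nats, clipping to avoid log(0)."""
    p = np.clip(p, 1e-12, 1.0)
    return -np.sum(p * np.log(p), axis=axis)

def binary_decompose(ensemble_probs):
    """Classic two-way split for an (M, K) array of M ensemble
    members' class probabilities over K classes:
      total     = H[ mean_m p_m ]   (entropy of the mean prediction)
      aleatoric = mean_m H[ p_m ]   (mean of the members' entropies)
      epistemic = total - aleatoric (mutual information I(y; theta | x))
    """
    mean_p = ensemble_probs.mean(axis=0)
    total = entropy(mean_p)
    aleatoric = entropy(ensemble_probs, axis=-1).mean()
    epistemic = total - aleatoric
    return total, aleatoric, epistemic

# Two confident but disagreeing members: the binary view reports high
# "epistemic" uncertainty, with no way to say *why* (parameters?
# model structure? distribution shift?) — the gap the taxonomy targets.
probs = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
total, aleatoric, epistemic = binary_decompose(probs)
```

The point of the example is that this single scalar "epistemic" term conflates the distinct model-side sources (parameter, structural, distributional) that the proposed five-way taxonomy separates.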
📝 Abstract
The ideas of aleatoric and epistemic uncertainty are widely used to reason about the probabilistic predictions of machine-learning models. We identify incoherence in existing discussions of these ideas and suggest that it stems from the aleatoric-epistemic view being insufficiently expressive to capture all of the distinct quantities that researchers are interested in. To explain and address this, we derive a simple delineation of different model-based uncertainties and of the data-generating processes associated with training and evaluation. Using this in place of the aleatoric-epistemic view could produce clearer discourse as the field moves forward.