🤖 AI Summary
Uncertainty quantification (UQ) in machine learning suffers from terminological ambiguity and strong contextual dependence, causing a critical misalignment between modeling intent and technical implementation. To address this, we integrate philosophical epistemology, interdisciplinary terminology science, a taxonomy of UQ methodologies, and empirical simulation-based inference (SBI) studies. Our analysis systematically maps inferential objectives (such as prediction, parameter estimation, and decision support) to statistical paradigms (e.g., frequentist, Bayesian, and distribution-free approaches). We introduce an "intent–implementation alignment" principle and propose a multidimensional trustworthiness framework encompassing reproducibility, interpretability, and context-aware adaptability. We also provide standardized guidelines for UQ design and evaluation tailored to scientific ML applications. The resulting framework is intended to strengthen the rigor, transparency, and reliability of uncertainty statements, enabling principled, context-sensitive UQ practice across domains.
📝 Abstract
Quantifying uncertainties for machine learning (ML) models is a foundational challenge in modern data analysis. This challenge is compounded by at least two key aspects of the field: (a) inconsistent terminology surrounding uncertainty and estimation across disciplines, and (b) the varying technical requirements for establishing trustworthy uncertainties in diverse problem contexts. In this position paper, we aim to clarify the depth of these challenges by identifying such inconsistencies and articulating how different contexts impose distinct epistemic demands. We examine the current landscape of estimation targets (e.g., prediction, inference, simulation-based inference (SBI)), uncertainty constructs (e.g., frequentist, Bayesian, fiducial), and the approaches used to map between them. Drawing on the literature, we highlight and explain examples of problematic mappings. To help address these issues, we advocate for standards that promote alignment between the *intent* and *implementation* of uncertainty quantification (UQ) approaches. We discuss several axes of trustworthiness that are necessary (if not sufficient) for reliable UQ in ML models, and show how these axes can inform the design and evaluation of uncertainty-aware ML systems. Our practical recommendations focus on scientific ML, offering illustrative cases and use scenarios, particularly in the context of SBI.
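To make the intent–implementation distinction concrete, the short sketch below (our illustration, not from the paper) contrasts two uncertainty constructs on the same toy data: a frequentist confidence interval, which is a statement about the long-run coverage of a procedure, and a Bayesian credible interval, which is a statement of posterior belief about a parameter. The data, prior choice, and sample size are all assumptions made for this example.

```python
# Illustrative sketch: the same data yield a frequentist confidence
# interval and a Bayesian credible interval that can look numerically
# alike while answering different questions -- the kind of
# intent/implementation mismatch the paper warns about.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=20)  # toy observations (assumed)
n, xbar, s = len(data), data.mean(), data.std(ddof=1)

# Frequentist 95% confidence interval for the mean (t-based):
# coverage is a property of the *procedure* over repeated sampling.
t_crit = stats.t.ppf(0.975, df=n - 1)
ci = (xbar - t_crit * s / np.sqrt(n), xbar + t_crit * s / np.sqrt(n))

# Bayesian 95% credible interval under the standard noninformative
# prior p(mu, sigma^2) ~ 1/sigma^2 (assumed here for illustration):
# a statement of belief about the *parameter* given these data.
posterior = stats.t(df=n - 1, loc=xbar, scale=s / np.sqrt(n))
cri = posterior.interval(0.95)

print(f"frequentist CI : ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"Bayesian CrI   : ({cri[0]:.3f}, {cri[1]:.3f})")
# The endpoints coincide here only because of the noninformative-prior
# choice; the underlying uncertainty constructs remain distinct.
```

Numerical agreement in special cases like this one is exactly why reporting an interval without stating its construct (confidence vs. credibility) leaves the uncertainty statement ambiguous.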