No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty Attributions

📅 2026-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the lack of a unified evaluation framework for uncertainty attribution methods, which hinders meaningful cross-method comparison. To this end, we adapt the Co-12 framework from explainable AI (XAI) to the uncertainty attribution setting, introducing a multi-dimensional evaluation protocol that covers correctness, consistency, continuity, and compactness, together with a newly proposed property, conveyance, which tests whether controlled increases in epistemic uncertainty reliably propagate to feature-level attributions. Combining uncertainty quantification techniques such as Monte Carlo Dropout and Monte Carlo DropConnect with gradient- and perturbation-based attribution methods, we conduct a systematic assessment across tabular and image datasets using eight distinct metrics. Our experiments reveal that gradient-based approaches consistently outperform perturbation-based ones in consistency and conveyance, and that Monte Carlo DropConnect outperforms Monte Carlo Dropout on most metrics. Importantly, no single metric suffices to evaluate attribution quality holistically.

📝 Abstract
Research on explainable AI (XAI) has frequently focused on explaining model predictions. More recently, methods have been proposed to explain prediction uncertainty by attributing it to input features (uncertainty attributions). However, the evaluation of these methods remains inconsistent as studies rely on heterogeneous proxy tasks and metrics, hindering comparability. We address this by aligning uncertainty attributions with the well-established Co-12 framework for XAI evaluation. We propose concrete implementations for the correctness, consistency, continuity, and compactness properties. Additionally, we introduce conveyance, a property tailored to uncertainty attributions that evaluates whether controlled increases in epistemic uncertainty reliably propagate to feature-level attributions. We demonstrate our evaluation framework with eight metrics across combinations of uncertainty quantification and feature attribution methods on tabular and image data. Our experiments show that gradient-based methods consistently outperform perturbation-based approaches in consistency and conveyance, while Monte Carlo DropConnect outperforms Monte Carlo Dropout in most metrics. Although most metrics rank the methods consistently across samples, inter-method agreement remains low. This suggests no single metric sufficiently evaluates uncertainty attribution quality. The proposed evaluation framework contributes to the body of knowledge by establishing a foundation for systematic comparison and development of uncertainty attribution methods.
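The abstract pairs Monte Carlo Dropout-style uncertainty quantification with feature-level attribution of that uncertainty. As a rough illustration only (not the paper's implementation), the sketch below estimates epistemic uncertainty as the predictive variance over stochastic forward passes of a tiny toy network, then attributes that uncertainty to input features via finite-difference gradients of the variance. The toy model, its weights, and all function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer network weights (illustrative only, not from the paper).
W1 = rng.normal(size=(4, 8))
W2 = rng.normal(size=(8, 1))

def mc_dropout_predict(x, T=200, p=0.5, seed=0):
    """Run T stochastic forward passes with dropout kept ON at inference.

    Returns the mean prediction and the predictive variance, which serves
    as a simple proxy for epistemic uncertainty.
    """
    r = np.random.default_rng(seed)
    h = np.maximum(x @ W1, 0.0)            # hidden activations (ReLU)
    preds = []
    for _ in range(T):
        mask = r.random(h.shape) > p       # Bernoulli dropout mask
        preds.append(((h * mask) / (1 - p)) @ W2)
    preds = np.array(preds)
    return preds.mean(), preds.var()

def uncertainty_attribution(x, eps=1e-3):
    """Finite-difference gradient of the predictive variance w.r.t. each
    input feature -- a crude stand-in for gradient-based uncertainty
    attribution. The fixed seed gives common random numbers across the
    base and perturbed evaluations, which stabilises the differences."""
    _, u0 = mc_dropout_predict(x)
    grads = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps
        _, ui = mc_dropout_predict(xp)
        grads[i] = (ui - u0) / eps
    return grads

x = np.array([0.5, -1.0, 0.2, 0.8])
mean_pred, epistemic_var = mc_dropout_predict(x)
attr = uncertainty_attribution(x)
print(epistemic_var, attr)
```

Features with large-magnitude entries in `attr` are those whose perturbation most changes the model's epistemic uncertainty, which is the kind of feature-level signal the conveyance property is designed to probe.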
Problem

Research questions and friction points this paper is trying to address.

uncertainty attributions
evaluation framework
explainable AI
metric comparability
XAI evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

uncertainty attribution
multi-dimensional evaluation
conveyance
explainable AI
epistemic uncertainty
Emily Schiller
XITASO GmbH IT & Software Solutions, Augsburg, Germany; The Artificial Intelligence and Cognitive Load Research Lab, University College Cork
Teodor Chiaburu
Berliner Hochschule für Technik, Berlin, Germany
Marco Zullich
University of Groningen
Deep learning
Luca Longo
University College Cork; Trinity College Dublin
Explainable AI (XAI) · Argumentation · Cognitive Load · Explainable Artificial Intelligence · Neural Eng