🤖 AI Summary
To address declining model interpretability and the consequent trust deficits in complex machine learning systems, this paper proposes a trustworthy XAI framework grounded in uncertainty disentanglement. Methodologically, it disentangles aleatoric from epistemic uncertainty and leverages epistemic uncertainty as a reliability metric for explanations: (i) as a rejection threshold for low-fidelity explanations, and (ii) as a dynamic signal to adapt explanation strategies—e.g., feature attribution or counterfactual generation—to local uncertainty conditions. The framework integrates Bayesian neural networks, rigorous uncertainty quantification, and state-of-the-art XAI techniques. Experiments across diverse models—including traditional machine learning and deep neural networks—demonstrate substantial improvements in explanation stability and robustness. Crucially, the framework effectively filters out unreliable explanations, thereby enhancing user trust in AI-driven decisions.
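A common way to perform the disentanglement the summary describes is the entropy decomposition over stochastic forward passes (e.g., MC dropout or a deep ensemble): total predictive entropy splits into expected entropy (aleatoric) plus mutual information (epistemic). The sketch below assumes classification with softmax outputs; the function name and interface are illustrative, not from the paper.

```python
import math

def decompose_uncertainty(probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.

    probs: list of T probability vectors (one per stochastic forward pass,
    e.g. MC dropout samples) for a SINGLE input.
    Returns (total, aleatoric, epistemic), where
      total     = entropy of the mean prediction,
      aleatoric = mean entropy of the individual predictions,
      epistemic = total - aleatoric  (mutual information / BALD score).
    """
    eps = 1e-12  # guard against log(0)
    n_classes = len(probs[0])
    t = len(probs)
    # Mean prediction over the T stochastic passes.
    mean_p = [sum(p[c] for p in probs) / t for c in range(n_classes)]
    total = -sum(p * math.log(p + eps) for p in mean_p)
    aleatoric = -sum(
        sum(pc * math.log(pc + eps) for pc in p) for p in probs
    ) / t
    epistemic = total - aleatoric
    return total, aleatoric, epistemic
```

When all passes agree, the epistemic term vanishes and any remaining uncertainty is attributed to the data; when passes disagree, the mutual-information term grows, signaling that the model itself is uncertain in that region.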
📝 Abstract
Recent advancements in machine learning have emphasized the need for transparency in model predictions, particularly as interpretability diminishes with increasingly complex architectures. In this paper, we propose leveraging prediction uncertainty as a complementary approach to classical explainability methods. Specifically, we distinguish between aleatoric (data-related) and epistemic (model-related) uncertainty to guide the selection of appropriate explanations. Epistemic uncertainty serves as a rejection criterion for unreliable explanations and, in itself, provides insight into insufficient training (a new form of explanation). Aleatoric uncertainty informs the choice between feature-importance explanations and counterfactual explanations. Together, these choices yield a framework of explainability methods driven by uncertainty quantification and disentanglement. Our experiments demonstrate the impact of this uncertainty-aware approach on the robustness and attainability of explanations in both traditional machine learning and deep learning scenarios.
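The decision logic described in the abstract can be sketched as a small dispatch function: reject when epistemic uncertainty is high (and report the insufficiency of training as the explanation itself), otherwise use aleatoric uncertainty to pick the explanation type. Note the thresholds, the function name, and the direction of the aleatoric mapping (high noise ⇒ counterfactual) are illustrative assumptions, not specifics from the paper.

```python
def select_explanation(aleatoric, epistemic,
                       reject_threshold=0.3, aleatoric_threshold=0.2):
    # Hypothetical thresholds; in practice they would be calibrated
    # on a held-out validation set.
    if epistemic > reject_threshold:
        # Model knowledge is insufficient here: reject the explanation
        # and surface the epistemic uncertainty itself as the signal
        # of insufficient training.
        return "reject"
    if aleatoric > aleatoric_threshold:
        # Assumed mapping: in a noisy-data regime, a counterfactual
        # shows what minimal change would flip the prediction.
        return "counterfactual"
    # Low-noise, well-covered regime: feature attributions apply.
    return "feature_attribution"
```

In a pipeline, this function would sit between the uncertainty-decomposition step and the explainer (e.g., a SHAP-style attributor or a counterfactual generator), so that only inputs passing the epistemic gate ever reach an explanation method.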