Improving Perturbation-based Explanations by Understanding the Role of Uncertainty Calibration

📅 2025-11-13
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work identifies insufficient uncertainty calibration as a critical impediment to perturbation-based explanation methods (e.g., LIME, SHAP): systematic distortion in a model's probability estimates under explanatory perturbations degrades both the reliability and the stability of local and global interpretations. The authors establish, for the first time, theoretical links between calibration quality and explanation quality, and propose ReCalX, a recalibration framework designed specifically for explanation scenarios that adjusts output confidence under perturbations without altering the original predictions. Extensive experiments show that ReCalX significantly reduces perturbation-specific calibration error while consistently improving feature-importance identification and explanation robustness across diverse models and datasets. By providing a verifiable, explanation-aware calibration paradigm, ReCalX strengthens the foundation for trustworthy model interpretation.

📝 Abstract
Perturbation-based explanations are widely used to enhance the transparency of machine-learning models in practice. However, their reliability is often compromised by unknown model behavior under the specific perturbations used. This paper investigates the relationship between uncertainty calibration (the alignment of model confidence with actual accuracy) and perturbation-based explanations. We show that models systematically produce unreliable probability estimates when subjected to explainability-specific perturbations, and we theoretically prove that this directly undermines global and local explanation quality. To address this, we introduce ReCalX, a novel approach to recalibrate models for improved explanations while preserving their original predictions. Empirical evaluations across diverse models and datasets demonstrate that ReCalX yields the largest reductions in perturbation-specific miscalibration while enhancing explanation robustness and the identification of globally important input features.
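The miscalibration the abstract describes can be made concrete with a standard expected calibration error (ECE) computation. The sketch below is illustrative and not from the paper: it simulates a model that is well calibrated on clean inputs but stays overconfident when features are occluded, the kind of perturbation LIME- and SHAP-style explainers routinely apply. All variable names and the toy data are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence and average the |confidence - accuracy| gap,
    weighted by the fraction of samples in each bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece

rng = np.random.default_rng(0)
# Clean inputs: accuracy tracks confidence, so the model is calibrated.
clean_conf = rng.uniform(0.6, 1.0, 1000)
clean_correct = (rng.uniform(size=1000) < clean_conf).astype(float)
# Perturbed (occluded) inputs: confidence stays high but accuracy drops,
# so the same confidences are now miscalibrated.
perturbed_conf = clean_conf
perturbed_correct = (rng.uniform(size=1000) < clean_conf - 0.3).astype(float)

print(expected_calibration_error(clean_conf, clean_correct))          # small
print(expected_calibration_error(perturbed_conf, perturbed_correct))  # large
```

A perturbation-specific calibration error in this spirit, measured on the perturbed inputs an explainer actually queries rather than on the original data, is what an explanation-aware recalibration method would aim to reduce.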
Problem

Research questions and friction points this paper is trying to address.

Investigating how uncertainty calibration affects perturbation-based explanation reliability
Addressing model miscalibration caused by explainability-specific perturbations
Improving explanation robustness through recalibration while preserving predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Recalibrating models to improve explanation reliability
Addressing perturbation-induced uncertainty calibration issues
Preserving original predictions while enhancing explanation robustness
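The paper's ReCalX procedure is not reproduced here. As a minimal sketch of the general idea the bullets describe (adjusting output confidence without changing predictions), temperature scaling is a well-known stand-in: dividing logits by a positive temperature reshapes the probabilities while leaving every argmax, and hence every predicted label, untouched.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def temperature_scale(logits, T):
    """Rescale logits by a temperature T > 0. This changes confidence
    (higher T flattens the distribution) but never the argmax, so the
    model's predicted labels are preserved."""
    return softmax(logits / T)

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 3.0, 0.2]])
for T in (1.0, 2.0, 5.0):
    probs = temperature_scale(logits, T)
    # Predicted labels are identical at every temperature.
    assert (probs.argmax(axis=1) == logits.argmax(axis=1)).all()
```

In practice T would be fitted on held-out data; an explanation-aware variant would fit it against the perturbed inputs an explainer generates. ReCalX itself may differ substantially from this sketch.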