Uncovering the Structure of Explanation Quality with Spectral Analysis

📅 2025-04-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing evaluation metrics for explainable AI (XAI) lack theoretical grounding, and it is often unclear which aspects of explanation quality they actually reward. Method: The paper proposes a framework based on spectral analysis of explanation outcomes. Spectral decomposition of matrices of explanation results separates two distinct dimensions of explanation quality: stability (robustness to input perturbations) and target sensitivity (responsiveness to the prediction target), yielding an interpretable, quantifiable two-dimensional view of quality. The analysis is complemented by pixel-flipping experiments and information-entropy analysis on MNIST and ImageNet. Results: Popular evaluation techniques are found to capture the trade-off between stability and target sensitivity only partially. The framework offers a principled basis for comparing, diagnosing, and improving explanation methods and their evaluation.
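The core operation described above, spectral decomposition of a matrix of explanation results, can be illustrated with a minimal sketch. This is not the paper's exact construction: the attribution function `explain`, the Gaussian perturbation scheme, and all parameter names are assumptions chosen for illustration.

```python
import numpy as np

def explanation_matrix(explain, x, targets, n_perturb=32, noise=0.05, seed=0):
    """Stack flattened explanations into one matrix: one row per
    (perturbed input, prediction target) pair. `explain(x, t)` stands in
    for any attribution method (hypothetical signature)."""
    rng = np.random.default_rng(seed)
    rows = []
    for _ in range(n_perturb):
        x_p = x + noise * rng.standard_normal(x.shape)  # assumed Gaussian perturbation
        for t in targets:
            rows.append(np.asarray(explain(x_p, t), dtype=float).ravel())
    return np.stack(rows)

def spectral_energy(E):
    """Normalized singular-value spectrum of the centered explanation matrix.
    Loosely: energy concentrated in few components suggests explanations that
    are stable across perturbations and targets; energy spread over many
    components indicates sensitivity to the input, the target, or both."""
    E = E - E.mean(axis=0, keepdims=True)
    s = np.linalg.svd(E, compute_uv=False)
    return s**2 / np.sum(s**2)
```

Inspecting how this spectrum shifts across explanation methods is one way to read off the stability/target-sensitivity trade-off the summary describes.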

📝 Abstract
As machine learning models are increasingly considered for high-stakes domains, effective explanation methods are crucial to ensure that their prediction strategies are transparent to the user. Over the years, numerous metrics have been proposed to assess the quality of explanations. However, their practical applicability remains unclear, in particular due to a limited understanding of which specific aspects each metric rewards. In this paper we propose a new framework based on spectral analysis of explanation outcomes to systematically capture the multifaceted properties of different explanation techniques. Our analysis uncovers two distinct factors of explanation quality, stability and target sensitivity, that can be directly observed through spectral decomposition. Experiments on both MNIST and ImageNet show that popular evaluation techniques (e.g., pixel-flipping, entropy) partially capture the trade-offs between these factors. Overall, our framework provides a foundational basis for understanding explanation quality, guiding the development of more reliable techniques for evaluating explanations.
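Of the evaluation techniques named in the abstract, pixel-flipping admits a compact generic sketch: remove pixels in order of decreasing attributed relevance and track how the model's score for the explained class degrades. This is an illustration under assumptions (zero-baseline removal, a `model` callable returning class scores of shape `(batch, classes)`), not the paper's exact protocol.

```python
import numpy as np

def pixel_flipping_curve(model, x, relevance, target, n_steps=50):
    """Flip pixels to a zero baseline in order of decreasing relevance and
    record the model's score for `target` after each step. A curve that
    drops faster indicates a more faithful explanation."""
    x = np.array(x, dtype=float)                   # work on a copy
    flat = x.reshape(-1)                           # flat view into the copy
    order = np.argsort(relevance.ravel())[::-1]    # most relevant first
    step = max(1, order.size // n_steps)
    scores = [float(model(x[None])[0, target])]
    for i in range(0, order.size, step):
        flat[order[i:i + step]] = 0.0              # "remove" the next batch of pixels
        scores.append(float(model(x[None])[0, target]))
    return np.array(scores)
```

The area under this curve is a common scalar summary: lower means the explanation ranked truly influential pixels first.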
Problem

Research questions and friction points this paper is trying to address.

Clarifying the practical applicability of existing explanation quality metrics
Understanding which specific aspects each explanation metric rewards
Developing more reliable techniques for evaluating explanation methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spectral analysis framework for assessing explanation quality
Identifies stability and target sensitivity as distinct quality factors
Evaluates how popular metrics such as pixel-flipping and entropy capture the trade-offs between these factors (entropy sketch below)
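For the entropy-based evaluation referenced in the last item, a common generic formulation treats the normalized absolute attribution map as a probability distribution and computes its Shannon entropy. The normalization and `eps` smoothing below are assumptions; the paper's exact entropy analysis may differ.

```python
import numpy as np

def explanation_entropy(relevance, eps=1e-12):
    """Shannon entropy of the normalized absolute relevance map. Low values
    mean relevance is concentrated on few pixels; high values mean the
    explanation is diffuse."""
    p = np.abs(np.asarray(relevance, dtype=float)).ravel()
    p = p / (p.sum() + eps)
    return float(-np.sum(p * np.log(p + eps)))
```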
👥 Authors
Johannes Maess
BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany; Machine Learning Group, TU Berlin, Berlin, Germany
G. Montavon
BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany; Charité – Universitätsmedizin Berlin, Germany
Shinichi Nakajima
BIFOLD, Technische Universität Berlin
Klaus-Robert Müller
BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany; Machine Learning Group, TU Berlin, Berlin, Germany; Department of Artificial Intelligence, Korea University, Seoul, Korea; Max-Planck Institute for Informatics, Saarbrücken, Germany
Thomas Schnake
Technical University of Berlin