On Spectral Properties of Gradient-based Explanation Methods

📅 2025-08-14
🤖 AI Summary
Problem: Explanation methods for deep neural networks suffer from reliability issues, which the authors attribute to insufficient formalism. Method: The paper analyzes gradient-based explanation methods from probabilistic and spectral perspectives and identifies a pervasive spectral bias stemming from the use of gradients. The analysis accounts for design choices that were previously found experimentally, in particular squared gradients and input perturbation, and shows how the choice of perturbation hyperparameters in methods such as SmoothGrad can produce inconsistent explanations. Two remedies follow from the formalism: a mechanism for determining a standard perturbation scale, and an aggregation method called SpectralLens. Results: Quantitative evaluations are reported in support of the theoretical results.

📝 Abstract
Understanding the behavior of deep networks is crucial to increase our confidence in their results. Despite an extensive body of work for explaining their predictions, researchers have faced reliability issues, which can be attributed to insufficient formalism. In our research, we adopt novel probabilistic and spectral perspectives to formally analyze explanation methods. Our study reveals a pervasive spectral bias stemming from the use of gradient, and sheds light on some common design choices that have been discovered experimentally, in particular, the use of squared gradient and input perturbation. We further characterize how the choice of perturbation hyperparameters in explanation methods, such as SmoothGrad, can lead to inconsistent explanations and introduce two remedies based on our proposed formalism: (i) a mechanism to determine a standard perturbation scale, and (ii) an aggregation method which we call SpectralLens. Finally, we substantiate our theoretical results through quantitative evaluations.
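
As a reading aid for the spectral bias mentioned in the abstract, the following uses only textbook Fourier identities and is not taken from the paper's own derivation. It assumes a sufficiently regular scalar model output $f$, the Fourier convention $\hat f(\omega) = \int f(x)\, e^{-i\omega^\top x}\,dx$, and Gaussian input perturbation with an assumed standard deviation $\sigma$ as in SmoothGrad. The symbols $\mathrm{SG}_\sigma$ and $g_\sigma$ are introduced here for illustration.

```latex
% Differentiation multiplies each frequency component by i*omega,
% so the gradient emphasizes high-frequency content of f:
\widehat{\nabla f}(\omega) = i\,\omega\,\hat f(\omega)

% Averaging gradients over Gaussian perturbations is a convolution
% with the Gaussian density g_sigma of N(0, sigma^2 I):
\mathrm{SG}_\sigma(x)
  = \mathbb{E}_{\epsilon \sim \mathcal{N}(0,\sigma^2 I)}\bigl[\nabla f(x+\epsilon)\bigr]
  = (\nabla f * g_\sigma)(x)

% In the frequency domain the convolution becomes a product
% with a Gaussian low-pass factor:
\widehat{\mathrm{SG}_\sigma}(\omega)
  = i\,\omega\, e^{-\sigma^2 \lVert\omega\rVert^2 / 2}\,\hat f(\omega)
```

Under these assumptions the combined filter magnitude $\lVert\omega\rVert\, e^{-\sigma^2\lVert\omega\rVert^2/2}$ peaks at $\lVert\omega\rVert = 1/\sigma$, so different perturbation scales emphasize different frequency bands of the model. This is one way to see why the choice of $\sigma$ can change the resulting explanation.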

Problem

Research questions and friction points this paper is trying to address.

Analyzes spectral bias in gradient-based explanation methods
Addresses inconsistent explanations caused by the choice of perturbation hyperparameters in methods such as SmoothGrad
Proposes remedies: standard perturbation scale and SpectralLens

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adopt probabilistic and spectral perspectives
Characterize the impact of perturbation hyperparameters on explanation consistency
Introduce SpectralLens aggregation method
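
To make the effect of the perturbation scale concrete, here is a minimal NumPy sketch of SmoothGrad-style averaging on a toy model. It is not the paper's implementation and does not implement SpectralLens or the standard-scale mechanism. The toy model `f(x) = sin(a . x)`, the function names `grad_f` and `smoothgrad`, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def grad_f(X, a):
    """Analytic gradient of the toy model f(x) = sin(a . x).

    X has shape (n, d) and a has shape (d,); returns shape (n, d).
    """
    return np.cos(X @ a)[:, None] * a


def smoothgrad(x, a, sigma, n_samples=20000, squared=False, seed=0):
    """Monte Carlo average of (optionally squared) gradients at x + noise.

    Noise is drawn i.i.d. from N(0, sigma^2) in every input dimension.
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=(n_samples, x.shape[0]))
    grads = grad_f(x + noise, a)  # one gradient per noisy copy of x
    if squared:
        grads = grads ** 2
    return grads.mean(axis=0)


# Hypothetical weight vector and input point for the toy model.
a = np.array([3.0, -1.0])
x = np.array([0.2, 0.5])

for sigma in (0.05, 0.3, 1.0):
    attribution = smoothgrad(x, a, sigma)
    squared_attribution = smoothgrad(x, a, sigma, squared=True)
    print(f"sigma={sigma}: mean grad={attribution}, mean squared grad={squared_attribution}")
```

For this toy model the expected averaged gradient has the closed form `a * cos(a . x) * exp(-sigma**2 * ||a||**2 / 2)`, so the attribution magnitude shrinks as `sigma` grows while the model itself is unchanged. The squared variant stays non-negative and decays toward a constant rather than toward zero, which illustrates how the two design choices respond differently to the same perturbation scale.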