🤖 AI Summary
This work addresses a limitation of Class Activation Mapping (CAM)-based explanation methods: they assume that the activation at a given feature map location is influenced only by the image region directly beneath it. That assumption breaks down in deep neural networks because of their extensive receptive fields and long-range dependencies. The paper proposes Feature Activation Map Explanation (FAME), an attribution method that uses network gradients to compute changes to the input image, perturbing it in a gradient-driven way rather than with fixed patches, to obtain pixel-level attributions. Applied to image classification and face recognition, FAME shows empirically that the local correspondence assumption does not hold for deeper networks, and it produces attribution maps that are competitive with state-of-the-art explanation techniques both qualitatively and quantitatively.
📝 Abstract
Deep learning has revolutionized machine learning, reaching unprecedented levels of accuracy, but at the cost of reduced interpretability. Especially in image processing systems, deep networks transform local pixel information into more global concepts in a highly obscured manner. Explainable AI methods for image processing try to shed light on this issue by highlighting the regions of the image that are important for the prediction task. Among these, Class Activation Mapping (CAM) and its gradient-based variants compute attributions on the feature map and upscale them to the image resolution, assuming that each feature map location is influenced only by the image region beneath it. Perturbation-based methods, such as CorrRISE, instead try to provide pixel-level attributions by perturbing the input with fixed patches and checking how the output of the network changes. In this work, we propose Feature Activation Map Explanation (FAME), which combines both worlds by using network gradients to compute changes to the input image, manipulating it in a gradient-driven way rather than using fixed patches. We apply this technique to two common tasks, image classification and face recognition, and show that CAM's above-mentioned assumption does not hold for deeper networks. We show qualitatively and quantitatively that FAME produces attribution maps that are competitive with state-of-the-art systems. Our code is available at https://github.com/AIML-IfI/fame.
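To make the CAM assumption discussed above concrete, the following minimal sketch builds a CAM-style map as a weighted sum of feature maps and upscales it to image resolution. It is an illustration only and not the FAME implementation: the function names `cam_map` and `upscale_nearest`, the toy feature maps, and the class weights are hypothetical, and nearest-neighbor upscaling is assumed here for simplicity (practical CAM variants often interpolate instead).

```python
def cam_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each h x w), followed by ReLU."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(h):
            for j in range(w):
                cam[i][j] += weight * fmap[i][j]
    # Keep only positive evidence for the class.
    return [[max(v, 0.0) for v in row] for row in cam]


def upscale_nearest(cam, scale):
    """Paint each feature-map cell over the scale x scale pixel block beneath it."""
    out = []
    for row in cam:
        wide = [v for v in row for _ in range(scale)]
        out.extend([list(wide) for _ in range(scale)])
    return out


# Hypothetical toy input: two 2x2 feature maps and their class weights.
feature_maps = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
class_weights = [0.5, -1.0]

cam = cam_map(feature_maps, class_weights)
attribution = upscale_nearest(cam, scale=2)  # 4x4 map at "image" resolution
```

The upscaling step is where the local correspondence assumption enters: every pixel inherits the score of the feature map cell above it, regardless of which pixels actually influenced that cell through the network's receptive field.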