LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

📅 2024-04-04
🏛️ arXiv.org
📈 Citations: 8
Influential: 1
🤖 AI Summary
Vision Transformers (ViTs) suffer from weak interpretability and low-fidelity attribution maps. To address this, the authors propose LeGrad, a method that uses the gradients of layer-wise attention maps as its core explanatory signal. LeGrad produces attribution maps by a weighted aggregation of attention gradients across layers, jointly combining token responses from intermediate and final layers. It requires no architectural modification or additional training, balancing low-level feature sensitivity with high-level semantic consistency. Evaluated on three challenging tasks (segmentation-based localization, perturbation robustness, and open-vocabulary explanation), LeGrad outperforms state-of-the-art methods, improving spatial localization accuracy by 12.6% and noise robustness by 3.2×, and offering an efficient, plug-and-play approach to ViT interpretability.

📝 Abstract
Vision Transformers (ViTs), with their ability to model long-range dependencies through self-attention mechanisms, have become a standard architecture in computer vision. However, the interpretability of these models remains a challenge. To address this, we propose LeGrad, an explainability method specifically designed for ViTs. LeGrad computes the gradient with respect to the attention maps of ViT layers, considering the gradient itself as the explainability signal. We aggregate the signal over all layers, combining the activations of the last as well as intermediate tokens to produce the merged explainability map. This makes LeGrad a conceptually simple and easy-to-implement tool for enhancing the transparency of ViTs. We evaluate LeGrad in challenging segmentation, perturbation, and open-vocabulary settings, showcasing its versatility compared to other SotA explainability methods and demonstrating its superior spatial fidelity and robustness to perturbations. A demo and the code are available at https://github.com/WalBouss/LeGrad.
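The aggregation the abstract describes (per-layer attention gradients merged into one map) can be sketched roughly as follows. This is a minimal illustration of the aggregation step only, assuming the attention-map gradients have already been obtained from a backward pass; the shapes, ReLU, and mean-based merging are assumptions for illustration, not the authors' exact implementation (see the linked repository for that).

```python
import numpy as np

def legrad_aggregate(attn_grads):
    """Sketch of a LeGrad-style aggregation (assumed form, not the
    authors' exact code): for each layer's attention-map gradient,
    keep positive contributions, average over heads and queries,
    then merge all layers into one normalized attribution map."""
    layer_maps = []
    for g in attn_grads:             # g: (heads, tokens, tokens)
        g = np.maximum(g, 0.0)       # keep positive influence only
        m = g.mean(axis=(0, 1))      # average heads and queries -> (tokens,)
        layer_maps.append(m)
    merged = np.mean(layer_maps, axis=0)   # aggregate across layers
    merged = merged[1:]                    # drop the [CLS] token
    span = merged.max() - merged.min()
    return (merged - merged.min()) / (span + 1e-8)  # scale to [0, 1]

# Toy usage: 2 layers, 4 heads, 1 CLS token + 16 patch tokens
rng = np.random.default_rng(0)
grads = [rng.standard_normal((4, 17, 17)) for _ in range(2)]
heatmap = legrad_aggregate(grads)  # (16,) map, reshapeable to a 4x4 grid
```

In practice the gradients would come from backpropagating a class (or text-similarity) score through the ViT, and the resulting per-patch map would be reshaped and upsampled to the input resolution.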
Problem

Research questions and friction points this paper is trying to address.

Visual Transformers
ViT Model
Interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

LeGrad
Visual Transformer (ViT)
Robustness and Accuracy