Disentangling Visual Transformers: Patch-level Interpretability for Image Classification

📅 2025-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
Vision Transformers (ViTs) achieve strong classification performance but suffer from poor interpretability, primarily because global self-attention mixes patch-wise features indiscriminately, obscuring individual patch contributions. Method: The paper proposes the Hindered Transformer (HiT), an intrinsically interpretable ViT-inspired architecture that hinders excessive cross-patch interactions in self-attention, keeping patch influences disentangled through the classification stage. Classification is explicitly modeled as a linear combination of patch-wise contributions, so the output logits correspond directly to per-patch terms. Contribution/Results: HiT provides fine-grained, patch-level explanations without post-hoc attribution methods, at a reasonable accuracy trade-off relative to standard ViTs on ImageNet and other benchmarks, making it attractive for applications where interpretability is paramount.
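The summary's central property, that the final logits are an exact (not approximate) linear combination of per-patch contributions, can be sketched as follows. This is an illustrative toy in NumPy, not the paper's code; the function name, shapes, and weights are assumptions for demonstration only.

```python
import numpy as np

def patchwise_linear_classify(patch_feats, W, b):
    """Toy sketch of the interpretability property HiT targets:
    logits decompose exactly into per-patch contributions.
    patch_feats: (num_patches, dim), W: (dim, num_classes), b: (num_classes,).
    Returns (logits, contributions) with logits == contributions.sum(0) + b."""
    contributions = patch_feats @ W          # (num_patches, num_classes)
    logits = contributions.sum(axis=0) + b   # exact decomposition, no residual
    return logits, contributions

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 8))   # 4 patches, 8-dim features (toy sizes)
W = rng.standard_normal((8, 3))       # 3 classes
b = np.zeros(3)

logits, contrib = patchwise_linear_classify(feats, W, b)
top = int(logits.argmax())
explanation = contrib[:, top]         # one attribution score per patch
```

Because the decomposition is exact, `explanation` is a faithful patch-level attribution map by construction, with no need for gradients or other post-hoc estimators.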

📝 Abstract
Visual transformers have achieved remarkable performance in image classification tasks, but this performance gain has come at the cost of interpretability. One of the main obstacles to the interpretation of transformers is the self-attention mechanism, which mixes visual information across the whole image in a complex way. In this paper, we propose Hindered Transformer (HiT), a novel interpretable-by-design architecture inspired by visual transformers. Our proposed architecture rethinks the design of transformers to better disentangle patch influences at the classification stage. Ultimately, HiT can be interpreted as a linear combination of patch-level information. We show that the advantages of our approach in terms of explainability come with a reasonable trade-off in performance, making it an attractive alternative for applications where interpretability is paramount.
Problem

Research questions and friction points this paper is trying to address.

Improving interpretability of visual transformers
Disentangling patch-level influences in image classification
Balancing performance and interpretability in transformers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hindered Transformer design
Patch-level interpretability focus
Linear combination interpretation method
Guillaume Jeanneret
Postdoc at ISIR lab - Sorbonne Université
Computer Vision · Explainable AI · Counterfactual Explanations
Loïc Simon
Normandy University, ENSICAEN, UNICAEN, CNRS, GREYC
Frédéric Jurie
Normandy University, ENSICAEN, UNICAEN, CNRS, GREYC