EPIC: Explanation of Pretrained Image Classification Networks via Prototype

📅 2025-05-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing XAI methods for explaining pre-trained image classification models face a fundamental trade-off: post-hoc attribution techniques are model-agnostic but yield coarse-grained explanations, while prior prototype-based approaches offer intuitive, part-level interpretations yet require architecture modification and full model retraining. This work introduces the first *post-hoc, prototype-level explanation method* that operates without altering the target model or performing any retraining—enabling fine-grained, prototype-driven attribution for black-box classifiers. Our approach integrates prototype retrieval and matching, feature-space projection, class-aware saliency guidance, and multi-scale patch embedding. Evaluated on CUB-200, Stanford Cars, and ImageNet, it generates semantically coherent, human-interpretable prototypes, substantially outperforming state-of-the-art post-hoc methods. By achieving both broad applicability and perceptually grounded interpretability, our method bridges the critical gap between generality and explanatory intuitiveness in XAI.

📝 Abstract
Explainable AI (XAI) methods generally fall into two categories. Post-hoc approaches generate explanations for pre-trained models and are compatible with various neural network architectures. These methods often use feature importance visualizations, such as saliency maps, to indicate which input regions influenced the model's prediction. Unfortunately, they typically offer a coarse understanding of the model's decision-making process. In contrast, ante-hoc (inherently explainable) methods rely on specially designed model architectures trained from scratch. A notable subclass of these methods provides explanations through prototypes, representative patches extracted from the training data. However, prototype-based approaches have limitations: they require dedicated architectures, involve specialized training procedures, and perform well only on specific datasets. In this work, we propose EPIC (Explanation of Pretrained Image Classification), a novel approach that bridges the gap between these two paradigms. Like post-hoc methods, EPIC operates on pre-trained models without architectural modifications. Simultaneously, it delivers intuitive, prototype-based explanations inspired by ante-hoc techniques. To the best of our knowledge, EPIC is the first post-hoc method capable of fully replicating the core explanatory power of inherently interpretable models. We evaluate EPIC on benchmark datasets commonly used in prototype-based explanations, such as CUB-200-2011 and Stanford Cars, alongside large-scale datasets like ImageNet, typically employed by post-hoc methods. EPIC uses prototypes to explain model decisions, providing a flexible and easy-to-understand tool for creating clear, high-quality explanations.
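The abstract describes explaining a frozen, pre-trained model by matching image regions to prototype patches in the model's own feature space. A minimal sketch of that retrieval-and-matching step, using random vectors as stand-ins for real patch features (the function name, dimensions, and toy data are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def nearest_prototypes(query_feat, prototype_bank, k=3):
    """Rank stored prototype-patch features by cosine similarity to a
    query patch feature extracted from the same frozen backbone."""
    q = query_feat / np.linalg.norm(query_feat)
    bank = prototype_bank / np.linalg.norm(prototype_bank, axis=1, keepdims=True)
    sims = bank @ q                      # cosine similarity to every prototype
    order = np.argsort(-sims)[:k]        # top-k most similar prototypes
    return order, sims[order]

# Toy stand-in for patch features from a pretrained model (no retraining needed).
rng = np.random.default_rng(0)
bank = rng.normal(size=(100, 128))              # 100 stored prototype patches, 128-d
query = bank[42] + 0.01 * rng.normal(size=128)  # a query patch near prototype 42

idx, sims = nearest_prototypes(query, bank)
print(idx[0])  # prototype 42 ranks first
```

Because only feature vectors from the frozen model are compared, nothing about the classifier has to be modified or retrained, which is the post-hoc property the abstract emphasizes.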
Problem

Research questions and friction points this paper is trying to address.

Bridges post-hoc and ante-hoc XAI methods for image classification
Explains pre-trained models via prototypes without architecture changes
Provides intuitive, flexible prototype-based explanations for diverse datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

First post-hoc method to deliver prototype-level explanations
Explains decisions via prototype patches matched in the model's feature space
Requires no architecture changes or retraining of the pre-trained model
Piotr Borycki
Jagiellonian University
Magdalena Trędowicz
Jagiellonian University
Szymon Janusz
Jagiellonian University
Jacek Tabor
Professor of Computer Science, Jagiellonian University
mathematics, computer science
Przemyslaw Spurek
Jagiellonian University, IDEAS
Arkadiusz Lewicki
University of Information Technology and Management
Lukasz Struski
Jagiellonian University