🤖 AI Summary
Deep neural networks' "black-box" nature limits their interpretability in critical tasks such as image recognition. To address this, we propose P-TAME, a model-agnostic method that produces high-resolution attention-map explanations in a single forward pass. Our approach replaces handcrafted perturbations with an end-to-end differentiable **trainable perturbation module** and pairs it with an auxiliary image classifier that extracts features from the input, so explanations can be generated without tailoring the method to the target model's internal architecture. Evaluated on VGG-16, ResNet-50, and ViT-B-16, P-TAME matches or outperforms mainstream methods, including Grad-CAM and Score-CAM, on quantitative metrics such as Faithfulness and Localization. Moreover, it supports plug-and-play deployment across diverse architectures, requiring no architectural modifications or retraining of the target model.
📝 Abstract
The adoption of Deep Neural Networks (DNNs) in critical fields where predictions need to be accompanied by justifications is hindered by their inherent black-box nature. In this paper, we introduce P-TAME (Perturbation-based Trainable Attention Mechanism for Explanations), a model-agnostic method for explaining DNN-based image classifiers. P-TAME employs an auxiliary image classifier to extract features from the input image, bypassing the need to tailor the explanation method to the internal architecture of the backbone classifier being explained. Unlike traditional perturbation-based methods, which have high computational requirements, P-TAME offers an efficient alternative by generating high-resolution explanations in a single forward pass during inference. We apply P-TAME to explain the decisions of VGG-16, ResNet-50, and ViT-B-16, three distinct and widely used image classifiers. Quantitative and qualitative results show that our method matches or outperforms previous explainability methods, including model-specific approaches. Code and trained models will be released upon acceptance.
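To make the efficiency contrast concrete, here is a minimal, hypothetical sketch of the classic perturbation-based baseline the abstract refers to: occlusion saliency, which needs one forward pass per occluded patch. The toy `classify` function and all names are illustrative stand-ins, not from the paper; P-TAME's contribution is that a trained perturbation module replaces this loop with a single forward pass at inference time.

```python
import numpy as np

# Toy stand-in for a DNN classifier: scores an 8x8 "image" against a
# fixed linear template. Purely illustrative, not the paper's model.
rng = np.random.default_rng(0)
template = rng.normal(size=(8, 8))

def classify(image):
    """Return a scalar class score for an 8x8 image."""
    return float((image * template).sum())

def occlusion_saliency(image, patch=2):
    """Classic perturbation-based explanation: zero out each patch and
    record the score drop. Costs one forward pass per patch."""
    base = classify(image)
    h, w = image.shape
    sal = np.zeros_like(image)
    n_passes = 0
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = 0.0
            # Larger score drop => patch was more important.
            sal[i:i + patch, j:j + patch] = base - classify(occluded)
            n_passes += 1
    return sal, n_passes

image = rng.normal(size=(8, 8))
sal, n_passes = occlusion_saliency(image)
print(n_passes)  # 16 passes for an 8x8 image with 2x2 patches
```

Even in this toy setting the cost scales with the number of patches (and grows quadratically with resolution), which is the computational burden P-TAME's single-pass trainable module is designed to avoid.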