Exposing Image Classifier Shortcuts with Counterfactual Frequency (CoF) Tables

📅 2024-05-24
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Image classification models often rely on spurious correlations—such as watermarks, background textures, or artifacts—rather than semantically meaningful features, undermining generalization. To address this, we propose the Counterfactual Frequency Table (CoF Table), the first method to automatically synthesize instance-level counterfactual explanations into a global shortcut pattern map. CoF Table leverages superpixel segmentation and semantic segmentation annotations to perform concept-level attribution aggregation, enabling both localization and quantification of prevalent shortcuts—including copyright watermarks, snowy backgrounds, and ink stains—without requiring external validation data. Evaluated across multiple standard benchmarks, our approach significantly improves the efficiency and interpretability of shortcut diagnosis, reduces manual auditing effort, and establishes a novel paradigm for robustness assessment of vision models.

📝 Abstract
The rise of deep learning in image classification has brought unprecedented accuracy but also highlighted a key issue: the use of 'shortcuts' by models. Such shortcuts are easy-to-learn patterns from the training data that fail to generalise to new data. Examples include the use of a copyright watermark to recognise horses, snowy background to recognise huskies, or ink markings to detect malignant skin lesions. The explainable AI (XAI) community has suggested using instance-level explanations to detect shortcuts without external data, but this requires the examination of many explanations to confirm the presence of such shortcuts, making it a labour-intensive process. To address these challenges, we introduce Counterfactual Frequency (CoF) tables, a novel approach that aggregates instance-based explanations into global insights, and exposes shortcuts. The aggregation implies the need for some semantic concepts to be used in the explanations, which we solve by labelling the segments of an image. We demonstrate the utility of CoF tables across several datasets, revealing the shortcuts learned from them.
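The aggregation described in the abstract — collecting the semantic concepts that appear in each instance's counterfactual explanation and tallying them per class — can be sketched as a small frequency computation. The data, concept names, and the `cof_table` helper below are illustrative assumptions, not the paper's actual implementation or datasets:

```python
from collections import Counter, defaultdict

# Hypothetical per-instance counterfactual explanations: for each image,
# the predicted class and the set of semantic concepts (labelled segments)
# whose perturbation flipped the model's prediction.
explanations = [
    ("husky", {"snow", "dog_face"}),
    ("husky", {"snow"}),
    ("husky", {"snow", "grass"}),
    ("horse", {"watermark"}),
    ("horse", {"watermark", "horse_body"}),
]

def cof_table(explanations):
    """Aggregate instance-level counterfactual concepts into a per-class
    frequency table: concept -> fraction of that class's instances whose
    counterfactual involved the concept."""
    counts = defaultdict(Counter)
    totals = Counter()
    for cls, concepts in explanations:
        totals[cls] += 1
        counts[cls].update(concepts)
    return {
        cls: {c: n / totals[cls] for c, n in concept_counts.items()}
        for cls, concept_counts in counts.items()
    }

table = cof_table(explanations)
# A concept that dominates a class's counterfactuals but is semantically
# irrelevant to the class (snow for husky, watermark for horse) flags a shortcut.
print(table["husky"]["snow"])       # 1.0
print(table["horse"]["watermark"])  # 1.0
```

In this toy table, "snow" appears in every husky counterfactual while "dog_face" appears in only one third of them, which is the kind of global signal the paper argues is invisible when inspecting explanations one at a time.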
Problem

Research questions and friction points this paper is trying to address.

Shortcut Problem
Deep Learning
Image Classification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Counterfactual Frequencies
Explainable AI (XAI)
Misleading Features Identification