🤖 AI Summary
Explainability of deep learning image classifiers suffers from a trade-off: pixel-level attribution maps (e.g., saliency maps) lack generalizability, while global explanations oversimplify model behavior. To bridge this gap, we propose Segment Attribution Tables (SATs), which align saliency maps with semantic segmentation masks to aggregate attributions into named, semantically meaningful image segments (e.g., “eye”, “background”). SATs operate post-hoc, requiring no model modification, and are compatible with any saliency method. They quantify a model's reliance on semantic regions, enabling reliable detection of spurious correlations (e.g., watermarks, background biases) and redundant features. Experiments demonstrate that SATs expose model biases even when out-of-distribution performance remains largely intact, uncovering flaws that conventional metrics miss. By providing interpretable insights at an intermediate granularity, between pixel-level maps and global summaries, SATs serve as a practical tool for diagnosing and debugging model behavior.
📝 Abstract
Deep learning dominates image classification tasks, yet understanding how models arrive at predictions remains a challenge. Much research focuses on local explanations of individual predictions, such as saliency maps, which visualise the influence of specific pixels on a model's prediction. However, reviewing many of these explanations to identify recurring patterns is infeasible, while global methods often oversimplify and miss important local behaviours. To address this, we propose Segment Attribution Tables (SATs), a method for summarising local saliency explanations into (semi-)global insights. SATs take image segments (such as "eyes" in Chihuahuas) and leverage saliency maps to quantify their influence. These segments highlight concepts the model relies on across instances and reveal spurious correlations, such as reliance on backgrounds or watermarks, even when out-of-distribution test performance sees little change. SATs can explain any classifier for which a form of saliency map can be produced, using segmentation maps that provide named segments. SATs bridge the gap between oversimplified global summaries and overly detailed local explanations, offering a practical tool for analysing and debugging image classifiers.
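The core aggregation step described above can be sketched in a few lines: given a per-pixel saliency map and a segmentation mask with named segments, sum and average the attributions within each segment. This is a minimal sketch of the idea only; the function name, the per-segment statistics, and the table layout are illustrative assumptions, not the paper's actual interface.

```python
import numpy as np

def segment_attribution_table(saliency, seg_mask, segment_names):
    """Aggregate a pixel-level saliency map into a table of named segments.

    saliency: (H, W) float array of attributions from any saliency method.
    seg_mask: (H, W) int array of segment ids from a segmentation model.
    segment_names: mapping from segment id to a human-readable name.
    Returns {name: {"mean_attr": ..., "share": ...}}, where "share" is the
    segment's fraction of the image's total absolute attribution.
    """
    total = np.abs(saliency).sum()
    table = {}
    for seg_id, name in segment_names.items():
        mask = seg_mask == seg_id
        if not mask.any():
            continue  # segment absent from this image
        table[name] = {
            "mean_attr": float(saliency[mask].mean()),
            "share": float(np.abs(saliency[mask]).sum() / total) if total > 0 else 0.0,
        }
    return table

# Toy example: a 2x2 "image" split into two segments.
saliency = np.array([[1.0, 2.0], [3.0, 4.0]])
seg_mask = np.array([[0, 0], [1, 1]])
sat = segment_attribution_table(saliency, seg_mask, {0: "background", 1: "eye"})
```

Averaging such tables over many images of a class would then yield the (semi-)global view the paper describes, e.g. revealing that a classifier places most of its attribution on "background" rather than on the object itself.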