AI Summary
Existing XAI methods struggle to compare the semantic differences between two learned representations in an unsupervised, interpretable way. This paper introduces RDX, the first interpretable, label-free, gradient-free framework for representation difference analysis. RDX leverages canonical correlation analysis (CCA) to align the two representations and constructs difference saliency maps that localize concept-level discrepancies. It integrates heatmap visualization with concept attribution analysis and supports mainstream architectures, including CNNs and Transformers. Evaluated on subsets of ImageNet and iNaturalist, RDX uncovers inter-model semantic biases and latent patterns in the data. In controlled experiments, it accurately recovers pre-specified conceptual differences, significantly outperforming baselines such as Grad-CAM and SHAP. By bridging a critical gap in interpretable representation comparison, RDX establishes a new paradigm for deep model diagnosis and model-evolution analysis.
Abstract
We propose a method for discovering and visualizing the differences between two learned representations, enabling more direct and interpretable model comparisons. We validate our method, which we call Representational Differences Explanations (RDX), by using it to compare models with known conceptual differences and demonstrate that it recovers meaningful distinctions where existing explainable AI (XAI) techniques fail. Applied to state-of-the-art models on challenging subsets of the ImageNet and iNaturalist datasets, RDX reveals both insightful representational differences and subtle patterns in the data. Although comparison is a cornerstone of scientific analysis, current tools in machine learning, namely post hoc XAI methods, struggle to support model comparison effectively. Our work addresses this gap by introducing an effective and explainable tool for contrasting model representations.