Explaining the Impact of Training on Vision Models via Activation Clustering

📅 2024-11-29
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates how the training data distribution, the level of supervision, and architectural choices (e.g., ViT registers) jointly shape the semantic structure of visual representations. To this end, the authors propose NAVE, an unsupervised method that clusters the feature activations of a frozen encoder to reveal which concepts the encoder has captured. The experiments show that (1) the level of supervision and the training data distribution reconfigure the high-level semantic organization of the representations; (2) registers in vision transformers improve the integration of local and global information; and (3) a watermark-based Clever Hans effect in the training set induces information saturation and degrades the learned representations. NAVE thus provides a quantitative framework for analyzing how visual representations form, enabling concept-level interpretability assessment and architecture-specific diagnostics.

📝 Abstract
Recent developments in the field of explainable artificial intelligence (XAI) for vision models investigate the information extracted by their feature encoder. We contribute to this effort and propose Neuro-Activated Vision Explanations (NAVE), which extracts the information captured by the encoder by clustering the feature activations of the frozen network to be explained. The method does not aim to explain the model's prediction but to answer questions such as which parts of the image are processed similarly or which information is kept in deeper layers. Experimentally, we leverage NAVE to show that the training dataset and the level of supervision affect which concepts are captured. In addition, our method reveals the impact of registers on vision transformers (ViT) and the information saturation caused by the watermark Clever Hans effect in the training set.
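The core mechanism the abstract describes is clustering the per-patch feature activations of a frozen encoder, so that patches assigned to the same cluster are the ones the encoder processes similarly. The sketch below illustrates that idea only in miniature; it is not the paper's implementation. The simulated 4×4 "activation grid" and the tiny k-means routine are assumptions for illustration, standing in for a real frozen ViT or CNN feature map.

```python
import random

def kmeans(vectors, k, iters=20):
    """Minimal k-means over per-patch activation vectors (illustrative only)."""
    n = len(vectors)
    # Deterministic init: spread initial centroids across the vector list.
    centroids = [list(vectors[i * (n - 1) // max(k - 1, 1)]) for i in range(k)]
    assign = [0] * n
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[i] for i in range(n) if assign[i] == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assign

# Hypothetical stand-in for frozen-encoder activations: a 4x4 patch grid
# whose left and right halves carry distinct feature statistics.
grid = 4
acts = []
for r in range(grid):
    for c in range(grid):
        base = [1.0, 0.0] if c < grid // 2 else [0.0, 1.0]
        rng = random.Random(r * grid + c)
        acts.append([b + rng.uniform(-0.1, 0.1) for b in base])

labels = kmeans(acts, k=2)
# Patches sharing a label are those the (simulated) encoder treats alike;
# reshaping gives a coarse segmentation-style explanation map.
segmentation = [labels[r * grid:(r + 1) * grid] for r in range(grid)]
```

In the actual method, the vectors would come from the spatial activations of one or more layers of the frozen network under inspection, and the resulting cluster map is read as an explanation of which image regions the encoder groups together.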
Problem

Research questions and friction points this paper is trying to address.

Explain vision model training impact
Analyze feature activation clustering
Investigate training dataset influence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Clusters feature activations
Evaluates training dataset impact
Measures information saturation effects
Ahcène Boubekki
Physikalisch-Technische Bundesanstalt, Berlin, Germany
XAI Representation Learning
Samuel G. Fadel
Linköping University, Sweden
Sebastian Mair
Linköping University & Uppsala University, Sweden