🤖 AI Summary
This work addresses the limited interpretability of feature representations in deep neural networks (DNNs). We propose a unified visualization framework based on activation maximization that, to our knowledge, is the first to systematically extend feature visualization to intermediate layers of both convolutional neural networks (CNNs) and vision transformers (ViTs). By integrating gradient-based optimization with multi-scale regularization, the method generates semantically meaningful visualizations that elucidate how layer-wise neuronal representations evolve hierarchically. Furthermore, we leverage the same framework to synthesize high-fidelity adversarial examples, thereby characterizing model decision boundaries and exposing structural vulnerabilities. Experiments demonstrate strong generalizability and interpretability across CNNs and ViTs, validating the framework's effectiveness in revealing internal representational semantics and failure modes. This approach establishes a novel paradigm for model diagnosis and robustness analysis.
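The core mechanism the summary describes, gradient ascent on the input to maximize a chosen neuron's activation under a regularizer, can be illustrated with a deliberately tiny stand-in model. The sketch below is an illustrative assumption, not the paper's implementation: the "neuron" is a single linear unit, and plain L2 regularization stands in for the multi-scale regularization mentioned above.

```python
import numpy as np

# Minimal activation-maximization sketch (toy stand-in for a real DNN).
# We maximize the response of one "neuron", f(x) = w . x, by gradient
# ascent on the input x, with an L2 penalty keeping x bounded.
# All names (w, lr, lam) and the linear model are illustrative assumptions.

rng = np.random.default_rng(0)
w = rng.normal(size=16)           # weights of the target neuron
x = rng.normal(size=16) * 0.01    # start from a near-blank input

lr, lam = 0.1, 0.01               # step size, L2 penalty strength
for _ in range(200):
    # objective: f(x) - lam * ||x||^2 ; gradient w.r.t. x is w - 2*lam*x
    x += lr * (w - 2 * lam * x)

# The optimized input aligns with the neuron's preferred direction,
# which is exactly what feature visualization renders as an image.
cosine = x @ w / (np.linalg.norm(x) * np.linalg.norm(w))
```

In a real network the only changes are that the gradient comes from backpropagation through the layers below the chosen unit, and the regularizer is richer (e.g. multi-scale smoothness) so the optimized input remains visually interpretable.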
📝 Abstract
Understanding the internal feature representations of deep neural networks (DNNs) is a fundamental step toward model interpretability. Inspired by neuroscience methods that probe biological neurons with visual stimuli, recent deep learning studies have employed Activation Maximization (AM) to synthesize inputs that elicit strong responses from artificial neurons. In this work, we propose a unified feature visualization framework applicable to both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). Unlike prior efforts, which predominantly focus on the final output-layer neurons of CNNs, we extend feature visualization to intermediate layers as well, offering deeper insight into the hierarchical structure of learned feature representations. Furthermore, we investigate how activation maximization can be leveraged to generate adversarial examples, revealing potential vulnerabilities and decision boundaries of DNNs. Our experiments demonstrate the effectiveness of our approach on both traditional CNNs and modern ViTs, highlighting its generalizability and interpretive value.
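The adversarial use of AM described in the abstract amounts to maximizing the activation of a *wrong-class* output neuron starting from a real input. A hedged sketch of that idea, on a deterministic toy linear classifier assumed purely for illustration (class `i` fires on input feature `i`), looks like this:

```python
import numpy as np

# Hedged sketch: adversarial example generation as activation maximization
# on an output-class neuron. The "model" is a toy 3-class linear classifier
# (an assumption standing in for a trained DNN), not the paper's setup.

W = np.eye(3, 8)                 # class i responds to input feature i
x = np.zeros(8)
x[0] = 1.0                       # clean input: clearly class 0

def predict(v):
    return int(np.argmax(W @ v))

target = 1                       # wrong class whose logit we ascend
x_adv = x.copy()
eps, steps = 0.05, 30
for _ in range(steps):
    # for a linear model, the gradient of the target logit w.r.t.
    # the input is simply the target neuron's weight row W[target]
    x_adv += eps * W[target]

# a structured perturbation along the target neuron's gradient
# flips the model's decision from class 0 to the target class
```

In a real DNN the per-step gradient comes from backpropagation and the perturbation is typically constrained (e.g. to a small norm ball) so the adversarial input stays visually close to the original; the crossing point of the prediction traces the model's decision boundary.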