🤖 AI Summary
This paper addresses the challenge of identifying distributed, layer-wide activation patterns in neural networks. To overcome the limitations of neuron-level and hand-crafted interpretable-feature analyses, we build on Neural Activation Pattern (NAP) modeling of full-layer activation distributions. Methodologically, we introduce improved normalization preprocessing, kernel density estimation for distribution modeling, refined distance metrics, and hierarchical clustering. Applied to a neural communication receiver, the approach uncovers a continuous activation manifold governed predominantly by signal-to-noise ratio (SNR) rather than a set of discrete concepts, consistent with classical receiver behavior and therefore physically plausible. Experiments further show that the improved NAP pipeline sharpens the separation between in-distribution and out-of-distribution samples, indirectly validating clustering quality and supporting SNR as a key implicit factor learned by the model. The method thereby strengthens generalization assessment, interpretability, and reliability diagnostics, enabling robust, physics-informed analysis of deep neural representations.
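For readers who want to see the moving parts, here is a minimal sketch of such a NAP-style pipeline: per-unit normalization, kernel density estimation of each sample's layer-wide activation profile, a distributional distance, and hierarchical clustering. All specifics (Jensen-Shannon distance, average linkage, grid size, cluster count) are illustrative assumptions, not the paper's implementation.

```python
# Minimal NAP-style sketch (illustrative only; metric and hyperparameter
# choices are assumptions, not the paper's implementation).
import numpy as np
from scipy.stats import gaussian_kde
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

def nap_clusters(activations, grid_size=64, n_clusters=8):
    """Cluster samples by their full-layer activation distributions.

    activations: (n_samples, n_units) array of one layer's activations,
    assumed non-degenerate (each sample has some spread across units).
    """
    # 1) Normalize each unit to [0, 1] over the dataset so units with a
    #    large dynamic range do not dominate the distribution estimate.
    a_min, a_max = activations.min(axis=0), activations.max(axis=0)
    normed = (activations - a_min) / (a_max - a_min + 1e-12)

    # 2) Kernel density estimate of each sample's layer-wide activation
    #    profile, evaluated on a common grid and renormalized to a pdf.
    grid = np.linspace(0.0, 1.0, grid_size)
    densities = np.stack([gaussian_kde(row)(grid) for row in normed])
    densities /= densities.sum(axis=1, keepdims=True)

    # 3) Pairwise distances between the per-sample distributions
    #    (Jensen-Shannon is one reasonable choice of metric).
    dists = pdist(densities, metric="jensenshannon")

    # 4) Agglomerative clustering of the condensed distance matrix; each
    #    cluster is a candidate layer-level activation pattern (NAP).
    Z = linkage(dists, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```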
📝 Abstract
Concept discovery in neural networks often targets individual neurons or human-interpretable features, overlooking distributed, layer-wide patterns. We study the Neural Activation Pattern (NAP) methodology, which clusters full-layer activation distributions to identify such layer-level concepts. We apply it to visual object recognition and radio receiver models and propose improvements to normalization, distribution estimation, distance metrics, and cluster selection. In the radio receiver model, distinct concepts did not emerge; instead, we observed a continuous activation manifold shaped by Signal-to-Noise Ratio (SNR), highlighting SNR as a key learned factor, consistent with classical receiver behavior and supporting physical plausibility. Our enhancements to NAP improved in-distribution vs. out-of-distribution separation, suggesting better generalization and indirectly validating clustering quality. These results underscore the importance of clustering design and activation manifolds in interpreting and troubleshooting neural network behavior.
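As an illustration of how the reported in-distribution vs. out-of-distribution separation could be quantified, the sketch below scores each sample by its distance to the nearest cluster-mean activation distribution and reports an AUROC. The scoring rule, metric, and input names (`densities_id`, `labels_id`, `densities_ood`) are assumptions for exposition, feeding from a NAP-style pipeline like the one sketched above; this is not the authors' evaluation protocol.

```python
# Hedged sketch of an ID-vs-OOD separation check: distance to the nearest
# cluster centroid as an OOD score, summarized by AUROC.
import numpy as np
from scipy.spatial.distance import jensenshannon
from sklearn.metrics import roc_auc_score

def nearest_cluster_distance(densities, centroids):
    # Distance of each per-sample distribution to its closest centroid.
    return np.array([
        min(jensenshannon(d, c) for c in centroids) for d in densities
    ])

def ood_separation(densities_id, labels_id, densities_ood):
    # Cluster centroids = mean activation distribution of each ID cluster.
    centroids = [densities_id[labels_id == k].mean(axis=0)
                 for k in np.unique(labels_id)]
    scores_id = nearest_cluster_distance(densities_id, centroids)
    scores_ood = nearest_cluster_distance(densities_ood, centroids)
    # AUROC of OOD (label 1) vs ID (label 0): higher means cleaner separation.
    y = np.concatenate([np.zeros(len(scores_id)), np.ones(len(scores_ood))])
    return roc_auc_score(y, np.concatenate([scores_id, scores_ood]))
```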