Formation of Representations in Neural Networks

📅 2024-10-03

🏛️ arXiv.org

📈 Citations: 4

✨ Influential: 0

🤖 AI Summary

Problem: How internal representations form in neural networks remains poorly understood, which limits the interpretability of these "black-box" models. Method: The paper proposes the Canonical Representation Hypothesis (CRH), which posits six alignment relations among latent representations, weights, and neuron gradients that emerge during training in most hidden layers. The CRH is paired with the Polynomial Alignment Hypothesis (PAH), which describes the reciprocal power-law relations that appear when the CRH breaks. Both are supported by a minimal-assumption theory in which the balance between gradient noise and regularization drives the emergence of the canonical representation. Results: Under the CRH, networks learn compact representations whose neurons and weights are invariant to task-irrelevant transformations. The authors suggest that the CRH and PAH together could unify phenomena such as neural collapse and the neural feature ansatz in a single framework.

📝 Abstract

Understanding neural representations will help open the black box of neural networks and advance our scientific understanding of modern AI systems. However, how complex, structured, and transferable representations emerge in modern neural networks has remained a mystery. Building on previous results, we propose the Canonical Representation Hypothesis (CRH), which posits a set of six alignment relations to universally govern the formation of representations in most hidden layers of a neural network. Under the CRH, the latent representations (R), weights (W), and neuron gradients (G) become mutually aligned during training. This alignment implies that neural networks naturally learn compact representations, where neurons and weights are invariant to task-irrelevant transformations. We then show that the breaking of CRH leads to the emergence of reciprocal power-law relations between R, W, and G, which we refer to as the Polynomial Alignment Hypothesis (PAH). We present a minimal-assumption theory proving that the balance between gradient noise and regularization is crucial for the emergence of the canonical representation. The CRH and PAH lead to an exciting possibility of unifying key deep learning phenomena, including neural collapse and the neural feature ansatz, in a single framework.
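
To make the idea of "alignment" between R, W, and G concrete, the sketch below shows one way such a check could be set up for a single linear layer. It is an illustration under stated assumptions, not the paper's procedure: the abstract does not give the exact definitions, so the second-moment matrices, the Frobenius cosine score, and every name here (`second_moment`, `alignment`, `psd_power`, the random stand-in data) are hypothetical choices. On a trained network, a CRH-style claim would correspond to the pairwise scores approaching 1, while a PAH-style power law could be probed by comparing one matrix against a power of another.

```python
import numpy as np

def second_moment(x):
    """Uncentered second-moment matrix E[x x^T]; rows of x are samples."""
    return x.T @ x / x.shape[0]

def alignment(a, b):
    """Cosine similarity of two matrices under the Frobenius inner product."""
    return float(np.sum(a * b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def psd_power(m, alpha):
    """Real power of a symmetric positive semidefinite matrix via eigh."""
    vals, vecs = np.linalg.eigh(m)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * vals**alpha) @ vecs.T

# Random stand-ins (assumption): a real check would use activations and
# gradients recorded from a trained network.
rng = np.random.default_rng(0)
n, d_in, d_out = 512, 8, 6
h_in = rng.normal(size=(n, d_in))    # hypothetical inputs to the layer
W = rng.normal(size=(d_out, d_in))   # hypothetical layer weights
h_out = h_in @ W.T                   # layer outputs (representations)
g_out = rng.normal(size=(n, d_out))  # stand-in for gradients w.r.t. h_out

H = second_moment(h_out)  # representation matrix (R)
G = second_moment(g_out)  # gradient matrix (G)
Z = W @ W.T               # weight matrix (W), output side

# Pairwise scores; 1.0 would mean the two matrices are proportional.
scores = {
    "R-W": alignment(H, Z),
    "R-G": alignment(H, G),
    "W-G": alignment(G, Z),
}

# Power-law probe: does H align better with some power of Z?
power_scores = {a: alignment(H, psd_power(Z, a)) for a in (0.5, 1.0, 2.0)}

for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because the data here are random, the printed scores carry no meaning beyond showing the mechanics. The cosine score is scale-invariant, which matches the "proportional to" flavor of an alignment relation.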

Problem

Research questions and friction points this paper is trying to address.

Understand neural representation formation

Propose Canonical Representation Hypothesis

Unify deep learning phenomena

Innovation

Methods, ideas, or system contributions that make the work stand out.

Canonical Representation Hypothesis

Polynomial Alignment Hypothesis

Balance between gradient noise and regularization
