Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)

📅 2025-02-28

📈 Citations: 0

✨ Influential: 0

career value

203K/year

🤖 AI Summary

Current deep learning theory lacks a unified explanatory framework for key dynamical phenomena—neural collapse, emergence, lazy/feature-learning regimes, sudden understanding, and generalization phase transitions. To address this, we propose a cross-scale dynamical framework grounded in layer-wise linear models as analytically tractable primitives, integrated with dynamical systems analysis, the neural tangent kernel (NTK) formalism, and linearized stability theory. Crucially, we generalize feedback mechanisms—previously studied only in simplified models—to full-scale deep networks for the first time, enabling unified modeling and analytical characterization of all five phenomena. Our framework reveals intrinsic connections among these behaviors and endows neural network dynamics with analytic tractability, predictability, and designability. It establishes a novel theoretical paradigm for foundational deep learning research.

Technology Category

Application Category

📝 Abstract

In physics, complex systems are often simplified into minimal, solvable models that retain only the core principles. In machine learning, layerwise linear models (e.g., linear neural networks) act as simplified representations of neural network dynamics. These models follow the dynamical feedback principle, which describes how layers mutually govern and amplify each other's evolution. This principle extends beyond the simplified models, successfully explaining a wide range of dynamical phenomena in deep neural networks, including neural collapse, emergence, lazy and rich regimes, and grokking. In this position paper, we call for the use of layerwise linear models retaining the core principles of neural dynamical phenomena to accelerate the science of deep learning.

Problem

Research questions and friction points this paper is trying to address.

Understand neural network dynamics using simplified layerwise linear models.

Explain phenomena like neural collapse, emergence, and grokking in deep learning.

Accelerate deep learning science by focusing on core dynamical principles.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Layerwise linear models simplify neural dynamics

Dynamical feedback principle explains neural phenomena

Core principles retained for deep learning science

🔎 Similar Papers

A spring-block theory of feature learning in deep neural networks