Position: Solve Layerwise Linear Models First to Understand Neural Dynamical Phenomena (Neural Collapse, Emergence, Lazy/Rich Regime, and Grokking)

📅 2025-02-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Current deep learning theory lacks a unified explanatory framework for key dynamical phenomena—neural collapse, emergence, lazy/feature-learning regimes, sudden understanding, and generalization phase transitions. To address this, we propose a cross-scale dynamical framework grounded in layer-wise linear models as analytically tractable primitives, integrated with dynamical systems analysis, the neural tangent kernel (NTK) formalism, and linearized stability theory. Crucially, we generalize feedback mechanisms—previously studied only in simplified models—to full-scale deep networks for the first time, enabling unified modeling and analytical characterization of all five phenomena. Our framework reveals intrinsic connections among these behaviors and endows neural network dynamics with analytic tractability, predictability, and designability. It establishes a novel theoretical paradigm for foundational deep learning research.

Technology Category

Application Category

📝 Abstract
In physics, complex systems are often simplified into minimal, solvable models that retain only the core principles. In machine learning, layerwise linear models (e.g., linear neural networks) act as simplified representations of neural network dynamics. These models follow the dynamical feedback principle, which describes how layers mutually govern and amplify each other's evolution. This principle extends beyond the simplified models, successfully explaining a wide range of dynamical phenomena in deep neural networks, including neural collapse, emergence, lazy and rich regimes, and grokking. In this position paper, we call for the use of layerwise linear models retaining the core principles of neural dynamical phenomena to accelerate the science of deep learning.
Problem

Research questions and friction points this paper is trying to address.

Understand neural network dynamics using simplified layerwise linear models.
Explain phenomena like neural collapse, emergence, and grokking in deep learning.
Accelerate deep learning science by focusing on core dynamical principles.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Layerwise linear models simplify neural dynamics
Dynamical feedback principle explains neural phenomena
Core principles retained for deep learning science
🔎 Similar Papers
No similar papers found.
Yoonsoo Nam
Yoonsoo Nam
University of Oxford
theory of machine learning
S
Seok Hyeong Lee
Center for Quantum Structures in Modules and Spaces, Seoul National University, Seoul, South Korea
C
Clémentine Dominé
Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
Y
Yea Chan Park
Center for AI and Natural Science, Korea Institute For Advanced Study, Seoul, South Korea
Charles London
Charles London
DPhil Student in CS, University of Oxford
machine learninglearning theorydeep learningstatistics
W
Wonyl Choi
Department of Computer Science, Boston University, Massachusetts, United States of America
N
Niclas Goring
Department of Theoretical Physics, University of Oxford, Oxfordshire, United Kingdom
S
Seungjai Lee
Department of Mathematics, Incheon National University, Incheon, South Korea