A spring-block theory of feature learning in deep neural networks

📅 2024-07-28
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work investigates the dynamical mechanisms underlying feature learning in deep neural networks (DNNs): how nonlinearity, noise, and learning rate jointly drive the collapse of high-dimensional data into low-dimensional, regular geometric structures. Method: the authors propose a macroscopic spring–block mechanical model, inspired by physical analogies, and construct a noise–nonlinearity phase diagram that quantitatively characterizes transitions in feature-learning efficiency between shallow and deep layers. Contribution/Results: the framework unifies the description of inter-layer feature evolution and generalization performance, reproducing and explaining the origins of the "lazy" and "active" training regimes of DNNs, and enabling quantitative modeling of cross-layer feature-learning strength and generalization capability. By integrating statistical physics, nonlinear dynamical systems theory, and deep learning, this work establishes a novel theoretical foundation for interpretable AI.

📝 Abstract
Feature-learning deep nets progressively collapse data to a regular low-dimensional geometry. How this phenomenon emerges from the collective action of nonlinearity, noise, learning rate, and other choices that shape the dynamics has eluded first-principles theories built from microscopic neuronal dynamics. We exhibit a noise-nonlinearity phase diagram that identifies regimes where shallow or deep layers learn more effectively. We then propose a macroscopic mechanical theory that reproduces the diagram, explaining why some DNNs are lazy and some active, and linking feature learning across layers to generalization.
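The spring-block analogy in the abstract can be illustrated with a toy simulation: blocks (layers) joined by springs relax on a frictional surface while one end is pulled, so motion propagates through the chain in stick-slip fashion. The sketch below is purely illustrative under assumed dynamics and parameters (overdamped relaxation, dry-friction threshold `mu`, external `pull`); it is not the paper's actual model or equations.

```python
import numpy as np

def relax_chain(n_blocks=10, k=1.0, pull=1.5, mu=0.5, steps=200, lr=0.05, seed=0):
    """Overdamped relaxation of a 1D spring-block chain with dry friction.

    Blocks at positions x[0..n-1] are coupled to neighbors by springs of
    stiffness k with unit rest length; the last block is pulled forward by
    an external spring, and a block moves only while the net force on it
    exceeds the friction threshold mu (stick-slip motion).
    """
    rng = np.random.default_rng(seed)
    x = np.arange(n_blocks, dtype=float) + 0.01 * rng.standard_normal(n_blocks)
    anchor = float(n_blocks - 1) + pull  # target of the external pull
    for _ in range(steps):
        f = np.zeros(n_blocks)
        stretch = np.diff(x) - 1.0          # deviation from unit rest length
        f[:-1] += k * stretch               # each spring pulls its left block forward
        f[1:] -= k * stretch                # ... and its right block backward
        f[-1] += k * (anchor - x[-1])       # external drive on the last block
        moving = np.abs(f) > mu             # dry friction: below threshold, blocks stick
        x[moving] += lr * (f[moving] - mu * np.sign(f[moving]))
    return x
```

In this caricature, the friction threshold plays the role of noise-induced resistance to feature movement, and the drive at one end mimics the training signal entering from the loss: lowering `mu` lets displacement (feature learning) spread deeper into the chain.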
Problem

Research questions and friction points this paper is trying to address.

Understanding how deep nets collapse data to low-dimensional geometry
Exploring collective effects of nonlinearity, noise, and learning rate
Proposing a mechanical theory linking feature learning to generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Spring-block theory models feature learning
Noise-nonlinearity phase diagram identifies learning regimes
Mechanical theory links feature learning to generalization
Chengzhi Shi
Departement Mathematik und Informatik, University of Basel, Spiegelgasse 1, 4051 Basel, Switzerland
Liming Pan
School of Cyber Science and Technology, University of Science and Technology of China, 230026, Hefei, China
Ivan Dokmanić
Associate Professor, Department of Mathematics and Computer Science, University of Basel
Signal Processing · Machine Learning · Inverse Problems