🤖 AI Summary
This work investigates the dynamical mechanisms underlying feature learning in deep neural networks (DNNs), specifically how nonlinearity, noise, and learning rate jointly drive the collapse of high-dimensional data into low-dimensional, regular geometric structures.
Method: We propose a macroscopic “spring–mass” mechanical model and construct, for the first time, a noise–nonlinearity phase diagram that quantitatively characterizes phase transitions in feature-learning efficiency between shallow and deep layers.
Contribution/Results: The framework gives a unified description of inter-layer feature-evolution dynamics and generalization performance, reproducing and explaining the origin of the “lazy” and “active” training regimes in DNNs and enabling quantitative modeling of cross-layer feature-learning strength and generalization. By integrating statistical physics, nonlinear dynamical systems theory, and deep learning, this work establishes a new theoretical foundation for interpretable AI.
📝 Abstract
Feature-learning deep nets progressively collapse data onto a regular low-dimensional geometry. How this phenomenon emerges from the collective action of nonlinearity, noise, learning rate, and other choices that shape the dynamics has eluded first-principles theories built from microscopic neuronal dynamics. We exhibit a noise–nonlinearity phase diagram that identifies regimes where shallow or deep layers learn more effectively. We then propose a macroscopic mechanical theory that reproduces the diagram, explaining why some DNNs are lazy and others active, and linking feature learning across layers to generalization.
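The abstract does not spell out the spring–mass dynamics, but the general flavor of such a macroscopic model can be illustrated with a toy overdamped chain: each “layer” is a mass coupled to its neighbors by linear springs, with a cubic nonlinearity and Gaussian noise. This is a minimal sketch under our own assumptions; the parameters `k`, `alpha`, `sigma` and the chain setup are illustrative, not the paper's actual model.

```python
import numpy as np

def simulate_chain(n_layers=8, k=1.0, alpha=0.5, sigma=0.1,
                   dt=1e-3, steps=5000, seed=0):
    """Toy overdamped spring-mass chain (illustrative only, not the paper's model).

    Euler-Maruyama integration of
        dx_i = [k * (x_{i+1} - 2 x_i + x_{i-1}) - alpha * x_i**3] dt + sigma dW_i
    with free ends, starting from a random initial configuration.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_layers)
    for _ in range(steps):
        # discrete Laplacian (spring coupling) with free boundaries
        lap = np.zeros_like(x)
        lap[1:-1] = x[2:] - 2 * x[1:-1] + x[:-2]
        lap[0] = x[1] - x[0]
        lap[-1] = x[-2] - x[-1]
        drift = k * lap - alpha * x**3
        x = x + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_layers)
    return x

# Without noise the chain relaxes toward a flat, low-"dimensional" configuration;
# turning noise and nonlinearity up or down changes where motion concentrates,
# loosely mirroring the shallow-vs-deep learning regimes of a phase diagram.
quiet = simulate_chain(sigma=0.0)
noisy = simulate_chain(sigma=0.5)
```

Varying `sigma` (noise) against `alpha` (nonlinearity) over a grid and recording per-mass displacement would produce a phase-diagram-style plot in this toy setting.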