🤖 AI Summary
This work investigates the principles underlying structural and behavioral dynamics in deep neural networks, focusing on how degeneracy in the loss landscape drives stage-wise developmental transitions during Transformer training.
Method: We use the local learning coefficient to quantify loss landscape degeneracy and document temporally aligned correspondences between this degeneracy, internal computational structures (including attention patterns and representation disentanglement), and input/output behavior. By monitoring full training trajectories of a transformer language model and an in-context linear regression transformer, we identify multiple developmental transitions that coincide with changes in degeneracy.
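To make the central quantity concrete, the sketch below estimates a local learning coefficient for a toy two-parameter loss with a degenerate minimum, by Langevin sampling from a localized, tempered posterior (the general recipe used for LLC estimation in this literature). Everything here is illustrative rather than the paper's implementation: the toy loss, the function name `sgld_llc`, and all hyperparameters are assumptions.

```python
import numpy as np

def loss(w):
    # Toy population loss with a degenerate minimum at the origin:
    # every point on either coordinate axis is also a global minimum.
    return (w[0] * w[1]) ** 2

def grad_loss(w):
    return np.array([2 * w[0] * w[1] ** 2, 2 * w[1] * w[0] ** 2])

def sgld_llc(w_star, n=100_000, beta=None, gamma=10.0,
             eps=1e-5, steps=50_000, burn_in=10_000, seed=0):
    """Estimate the local learning coefficient at w_star by Langevin
    sampling from the localized, tempered posterior
        p(w) ~ exp(-n * beta * loss(w) - (gamma / 2) * ||w - w_star||^2),
    then applying  lambda_hat = n * beta * (E[loss(w)] - loss(w_star))."""
    rng = np.random.default_rng(seed)
    if beta is None:
        beta = 1.0 / np.log(n)  # WBIC-style inverse temperature
    w = w_star.copy()
    samples = []
    for t in range(steps):
        drift = -n * beta * grad_loss(w) - gamma * (w - w_star)
        w = w + (eps / 2) * drift + np.sqrt(eps) * rng.standard_normal(w.shape)
        if t >= burn_in:
            samples.append(loss(w))
    return n * beta * (np.mean(samples) - loss(w_star))

w_star = np.zeros(2)
print(f"estimated LLC at the origin: {sgld_llc(w_star):.2f}")
# A regular two-parameter minimum would have coefficient d/2 = 1;
# the degenerate minimum here should score noticeably lower
# (the theoretical value for this toy loss is 1/2), though the
# estimate is sensitive to the sampler's hyperparameters.
```

In the setting of the paper, an estimator of this kind would be run at checkpoints along the training trajectory, so the resulting LLC curve can be read for periods of change.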
Contribution/Results: Our findings offer a paradigm grounded in singular learning theory for understanding the “developmental process” of deep networks. By presenting degeneracy as an organizing principle of training dynamics, this work moves deep learning from black-box optimization toward an interpretable, mechanistic science of neural development.
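One simple way to operationalize "distinct periods of change" in a per-checkpoint LLC series is trend segmentation. The paper's actual boundary criterion is not reproduced here; `stage_boundaries`, `window`, and `tol` below are hypothetical names and knobs for a minimal sketch.

```python
import numpy as np

def stage_boundaries(llc, window=5, tol=0.02):
    """Split a per-checkpoint LLC series into periods of rising, falling,
    or flat degeneracy, returning (approximate) checkpoint indices where
    the trend changes. Smoothing offsets indices by about window // 2."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(llc, kernel, mode="valid")
    step = np.diff(smoothed)
    trend = np.sign(np.where(np.abs(step) < tol, 0.0, step))
    return np.nonzero(np.diff(trend))[0] + 1

# Synthetic example: rising stage, plateau, rising again.
llc = np.concatenate([np.linspace(1, 5, 40),
                      np.full(40, 5.0),
                      np.linspace(5, 8, 40)])
print(stage_boundaries(llc))  # indices near the two stage boundaries
```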
📝 Abstract
Deep learning involves navigating a high-dimensional loss landscape over the neural network parameter space. Over the course of training, complex computational structures form and re-form inside the neural network, leading to shifts in input/output behavior. It is a priority for the science of deep learning to uncover principles governing the development of neural network structure and behavior. Drawing on the framework of singular learning theory, we propose that model development is deeply linked to degeneracy in the local geometry of the loss landscape. We investigate this link by monitoring loss landscape degeneracy throughout training, as quantified by the local learning coefficient, for a transformer language model and an in-context linear regression transformer. We show that training can be divided into distinct periods of change in loss landscape degeneracy, and that these changes in degeneracy coincide with significant changes in the internal computational structure and the input/output behavior of the transformers. This finding underscores the potential of a degeneracy-based perspective for understanding modern deep learning.
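For the second model organism, here is a minimal sketch of how in-context linear regression data for a transformer is typically generated (a common synthetic setup in the in-context learning literature). The dimensions, noise level, tokenization, and the function name `sample_icl_regression_batch` are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def sample_icl_regression_batch(batch=64, d=8, k=16, noise=0.1, rng=None):
    """Return token sequences [(x_1, y_1), ..., (x_k, y_k)] per context.

    Each context draws its own ground-truth weights w ~ N(0, I_d), so the
    model cannot memorize a single w; it must regress in context."""
    rng = rng or np.random.default_rng(0)
    w = rng.standard_normal((batch, d))              # per-context tasks
    x = rng.standard_normal((batch, k, d))           # inputs
    y = np.einsum("bkd,bd->bk", x, w) + noise * rng.standard_normal((batch, k))
    # One common tokenization: interleave x and y into a single sequence,
    # padding each scalar y up to dimension d so all tokens share a width.
    y_tok = np.zeros((batch, k, d))
    y_tok[:, :, 0] = y
    seq = np.stack([x, y_tok], axis=2).reshape(batch, 2 * k, d)
    return seq, y  # train the transformer to predict y_i from the prefix

seq, y = sample_icl_regression_batch()
print(seq.shape, y.shape)  # (64, 32, 8) (64, 16)
```

Because every context is a fresh regression problem, tracking how the model's predictions improve along the context, checkpoint by checkpoint, gives the behavioral signal that can then be set against the LLC curve.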