From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks

📅 2024-09-22
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
📄 PDF
🤖 AI Summary
This work investigates the precise influence of initialization on learning dynamics in deep linear networks, specifically how it governs the transition of representation evolution from the "lazy" regime (static representations, constant Neural Tangent Kernel, NTK) to the "rich" regime (dynamic representations, feature learning). Method: We introduce the λ-balanced initialization framework and derive, for the first time, closed-form analytical solutions for the entire training trajectory, jointly characterizing the co-evolution of weights, hidden-layer representations, and the NTK. Contribution/Results: Our theory quantitatively identifies the critical initialization scale separating lazy from rich learning, revealing how initialization determines which regime a network enters. The results yield testable theoretical criteria for continual learning, reversal learning, and transfer learning, thereby addressing a fundamental limitation of existing NTK theory: its inability to model dynamic representation learning.
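To make the λ-balanced condition concrete: below is a minimal NumPy sketch for a two-layer linear network f(x) = W2 W1 x, assuming the balancedness condition W2ᵀW2 − W1W1ᵀ = λI used in the deep-linear-network literature. The SVD-style construction, dimensions, and variable names are illustrative, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, o = 10, 4, 6   # input, hidden, output dims (illustrative; need d, o >= h)
lam = 0.5            # the lambda offset; lam = 0 recovers the balanced case

# Orthonormal factors for an SVD-style construction with a shared hidden basis U.
U, _ = np.linalg.qr(rng.standard_normal((h, h)))
V, _ = np.linalg.qr(rng.standard_normal((d, h)))
P, _ = np.linalg.qr(rng.standard_normal((o, h)))

# Pair the layer singular values so that s2^2 - s1^2 = lam in every direction.
s1 = rng.uniform(0.5, 1.0, h)
s2 = np.sqrt(s1**2 + lam)    # real as long as lam >= -min(s1**2)

W1 = U @ np.diag(s1) @ V.T   # first layer,  shape (h, d)
W2 = P @ np.diag(s2) @ U.T   # second layer, shape (o, h)

# Verify the lambda-balanced condition: W2^T W2 - W1 W1^T = lam * I.
assert np.allclose(W2.T @ W2 - W1 @ W1.T, lam * np.eye(h))
```

In this framework, the overall weight scale and the offset λ together control where on the lazy-to-rich spectrum the dynamics fall.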

📝 Abstract
Biological and artificial neural networks develop internal representations that enable them to perform complex tasks. In artificial networks, the effectiveness of these models relies on their ability to build task-specific representations, a process influenced by interactions among datasets, architectures, initialization strategies, and optimization algorithms. Prior studies highlight that different initializations can place networks in either a lazy regime, where representations remain static, or a rich/feature-learning regime, where representations evolve dynamically. Here, we examine how initialization influences learning dynamics in deep linear neural networks, deriving exact solutions for λ-balanced initializations, defined by the relative scale of weights across layers. These solutions capture the evolution of representations and the Neural Tangent Kernel across the spectrum from the rich to the lazy regime. Our findings deepen the theoretical understanding of the impact of weight initialization on learning regimes, with implications for continual learning, reversal learning, and transfer learning, relevant to both neuroscience and practical applications.
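The lazy/rich contrast in the abstract can be illustrated numerically. Below is a minimal sketch, not the paper's exact solutions: it trains a two-layer linear network with scalar output at a small and a large initialization scale and reports how much the empirical NTK moves over training. The dimensions, teacher setup, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def ntk(W1, W2, X):
    # Empirical NTK of f(x) = W2 W1 x with scalar output:
    # K(x, x') = (W2 W2^T) x^T x' + x^T W1^T W1 x'.
    return (W2 @ W2.T).item() * (X @ X.T) + X @ W1.T @ W1 @ X.T

def relative_ntk_drift(scale, steps=2000, lr=5e-3, seed=0):
    rng = np.random.default_rng(seed)
    d, h, n = 10, 20, 30
    X = rng.standard_normal((n, d))
    y = X @ rng.standard_normal((d, 1))    # linear teacher targets
    W1 = scale * rng.standard_normal((h, d)) / np.sqrt(d)
    W2 = scale * rng.standard_normal((1, h)) / np.sqrt(h)
    K0 = ntk(W1, W2, X)
    for _ in range(steps):                 # full-batch gradient descent
        err = X @ W1.T @ W2.T - y          # residuals of L = ||err||^2 / (2n)
        gW2 = err.T @ (X @ W1.T) / n       # dL/dW2
        gW1 = W2.T @ err.T @ X / n         # dL/dW1
        W1 -= lr * gW1
        W2 -= lr * gW2
    K = ntk(W1, W2, X)
    return np.linalg.norm(K - K0) / np.linalg.norm(K0)

for scale in (0.05, 2.0):   # small init -> rich, large init -> lazy
    print(f"init scale {scale}: relative NTK drift = {relative_ntk_drift(scale):.3f}")
```

Under these assumptions, the large-scale run should show little relative NTK movement (lazy regime), while the small-scale run should show substantial movement (rich regime).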
Problem

Research questions and friction points this paper is trying to address.

How does weight initialization govern whether a network learns in the lazy regime or the rich, feature-learning regime?
Can the full training trajectory be solved exactly for λ-balanced initializations in deep linear networks?
How do internal representations and the Neural Tangent Kernel evolve across the two regimes?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exact closed-form solutions for the entire training trajectory under λ-balanced initializations (a related numerical check appears after this list)
Joint characterization of the co-evolution of weights, hidden-layer representations, and the Neural Tangent Kernel
Quantitative account of how initialization scale selects between the lazy and rich regimes
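One reason λ-balanced trajectories are analytically tractable is a standard fact about linear networks: the balancedness matrix W2ᵀW2 − W1W1ᵀ is conserved under gradient flow, so an initialization with W2ᵀW2 − W1W1ᵀ = λI keeps that structure throughout training. The sketch below checks this numerically with small-step gradient descent; the setup is illustrative, and conservation is only approximate at finite step size.

```python
import numpy as np

rng = np.random.default_rng(1)
d, h, o, n = 8, 4, 8, 40
X = rng.standard_normal((n, d))
Y = X @ rng.standard_normal((d, o))      # linear teacher targets

# Start from a lambda-balanced pair (same SVD-style construction as above).
lam = 0.5
U, _ = np.linalg.qr(rng.standard_normal((h, h)))
V, _ = np.linalg.qr(rng.standard_normal((d, h)))
P, _ = np.linalg.qr(rng.standard_normal((o, h)))
s1 = rng.uniform(0.5, 1.0, h)
W1 = U @ np.diag(s1) @ V.T
W2 = P @ np.diag(np.sqrt(s1**2 + lam)) @ U.T

lr = 1e-3
for _ in range(3000):                    # full-batch gradient descent
    err = X @ W1.T @ W2.T - Y            # residuals of L = ||err||_F^2 / (2n)
    gW2 = err.T @ (X @ W1.T) / n
    gW1 = W2.T @ err.T @ X / n
    W1 -= lr * gW1
    W2 -= lr * gW2

# The balancedness matrix should still be close to lam * I after training.
drift = np.max(np.abs(W2.T @ W2 - W1 @ W1.T - lam * np.eye(h)))
print(f"max deviation from lam * I after training: {drift:.2e}")
```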