Random Matrix Theory of Early-Stopped Gradient Flow: A Transient BBP Scenario

📅 2026-04-20

🤖 AI Summary
This work proposes a spectral mechanism for early stopping: in some regimes, gradient flow makes the teacher signal detectable only within a finite time window before overfitting obscures it. Focusing on a linear teacher-student model, the study uses anisotropy in the input covariance to separate fast and slow learning directions, and it describes the emergence and loss of the signal as a transient Baik–Ben Arous–Péché (BBP) transition. Depending on signal strength and covariance anisotropy, the teacher spike is "never emerging," "persistently present," or "transiently visible," and the authors map the corresponding phase diagrams. For a two-block covariance model, they derive the time-dependent bulk spectrum of the symmetrized weight matrix from a 2×2 Dyson equation and obtain the outlier condition for a rank-one teacher from a rank-two determinant formula. The predictions are validated against finite-size simulations, giving a minimal, analytically tractable early stopping mechanism driven by anisotropy and noise.
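
For orientation, the classical static BBP transition can be written in its simplest textbook form. The symbols below ($W$, $\sigma$, $\theta$, $u$) are illustrative and belong to the standard spiked Wigner model; they are not taken from the paper's time-dependent setting:

$$
M = W + \theta\, u u^\top, \qquad
\lambda_{\max}(M) \;\to\;
\begin{cases}
\theta + \sigma^2/\theta, & \theta > \sigma, \\
2\sigma, & \theta \le \sigma,
\end{cases}
$$

where $W$ is an $N \times N$ Wigner matrix whose semicircle bulk spans $[-2\sigma, 2\sigma]$ and $u$ is a unit vector. An outlier exists only when the spike strength exceeds the threshold $\sigma$. In a transient scenario, the effective signal strength and the bulk edge would both change with training time, so such a threshold could be crossed and later recrossed.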

📝 Abstract
Empirical studies of trained models often report a transient regime in which signal is detectable in a finite gradient descent time window before overfitting dominates. We provide an analytically tractable random-matrix model that reproduces this phenomenon for gradient flow in a linear teacher-student setting. In this framework, learning occurs when an isolated eigenvalue separates from a noisy bulk, before eventually disappearing in the overfitting regime. The key ingredient is anisotropy in the input covariance, which induces fast and slow directions in the learning dynamics. In a two-block covariance model, we derive the full time-dependent bulk spectrum of the symmetrized weight matrix through a $2\times 2$ Dyson equation, and we obtain an explicit outlier condition for a rank-one teacher via a rank-two determinant formula. This yields a transient Baik-Ben Arous-Péché (BBP) transition: depending on signal strength and covariance anisotropy, the teacher spike may never emerge, emerge and persist, or emerge only during an intermediate time interval before being reabsorbed into the bulk. We map the corresponding phase diagrams and validate the theory against finite-size simulations. Our results provide a minimal solvable mechanism for early stopping as a transient spectral effect driven by anisotropy and noise.
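
The setting described in the abstract can be explored numerically with a small finite-size experiment. The sketch below is not the authors' code, and every modelling detail in it is an assumption made for illustration: a square student matrix trained from zero initialization on the loss $\frac{1}{2n}\|Y - WX\|_F^2$, a rank-one teacher `theta * u u^T`, Gaussian label noise, a diagonal two-block covariance, and the symmetrization `(W + W.T) / 2`. The function name `simulate_gradient_flow` and all of its parameters are hypothetical. Under these assumptions gradient flow has the closed form $W(t) = M\,\hat{C}^{-1}(I - e^{-\hat{C}t})$ with $\hat{C} = XX^\top/n$ and $M = YX^\top/n$, which the code evaluates through an eigendecomposition of $\hat{C}$.

```python
import numpy as np

def simulate_gradient_flow(d=60, n=240, frac_fast=0.5, var_fast=4.0, var_slow=0.25,
                           theta=3.0, noise=1.0, times=(0.0, 0.5, 2.0, 10.0), seed=0):
    """Top two eigenvalues of the symmetrized student matrix along gradient flow.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Two-block diagonal input covariance: a "fast" block and a "slow" block.
    d_fast = int(frac_fast * d)
    variances = np.concatenate([np.full(d_fast, var_fast),
                                np.full(d - d_fast, var_slow)])

    # Rank-one teacher along a random unit vector u.
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    W_star = theta * np.outer(u, u)

    # Inputs (columns are samples) and noisy teacher labels.
    X = np.sqrt(variances)[:, None] * rng.standard_normal((d, n))
    Y = W_star @ X + noise * rng.standard_normal((d, n))

    C = X @ X.T / n            # empirical input covariance
    M = Y @ X.T / n            # empirical label-input cross-covariance
    evals, V = np.linalg.eigh(C)
    MV = M @ V

    top, second = [], []
    for t in times:
        # Spectral filter (1 - exp(-lambda t)) / lambda, computed stably with expm1.
        filt = -np.expm1(-evals * t) / evals
        W_t = (MV * filt) @ V.T            # W(t) for zero initialization
        S = 0.5 * (W_t + W_t.T)            # symmetrized weight matrix
        eig = np.linalg.eigvalsh(S)        # ascending order
        top.append(eig[-1])
        second.append(eig[-2])
    return np.array(top), np.array(second)

if __name__ == "__main__":
    top, second = simulate_gradient_flow()
    for lam1, lam2 in zip(top, second):
        print(f"largest eigenvalue {lam1:.3f}, second largest {lam2:.3f}")
```

The second-largest eigenvalue serves only as a crude finite-size proxy for the bulk edge; a gap between the two values that opens and later closes as `times` increases would be the numerical signature of a transiently visible spike. Whether that happens depends on the chosen `theta`, `noise`, and block variances, so the defaults should be treated as a starting point for scanning rather than as a reproduction of the paper's phase diagram.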
Problem

Research questions and friction points this paper is trying to address.

early stopping
transient regime
random matrix theory
BBP transition
anisotropy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random Matrix Theory
Early Stopping
Transient BBP Transition
Anisotropic Covariance
Gradient Flow