🤖 AI Summary
This work investigates how the eigenvectors of the Neural Tangent Kernel (NTK) evolve during the "edge of stability" (EoS) regime of gradient descent (GD). While prior studies focus on oscillations of the largest NTK eigenvalue, a systematic characterization of the eigenvector dynamics has been lacking. We empirically discover, across MLPs, CNNs, and Transformers, that under large learning rates the dominant NTK eigenvector progressively aligns with the ground-truth labels during training. To analyze this phenomenon, we combine empirical NTK spectral analysis, dynamical modeling of GD, and a theoretical derivation for two-layer linear networks. Our results uncover an intrinsic coupling between the spectral and directional evolution of the NTK during EoS: eigenvalues oscillate while eigenvectors adaptively align with task-relevant directions. This reveals a previously uncharacterized mechanism governing non-stationary training dynamics in deep neural networks, offering both conceptual insight and experimentally testable predictions for understanding optimization near stability boundaries.
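The coupling between step size and the largest NTK eigenvalue can be illustrated with a standard toy model (not the paper's actual experimental setup): a two-layer linear "network" f = u·w fit to a single target with squared loss. Here the NTK is the 1×1 kernel K = u² + w², and GD is locally stable only when the step size η satisfies η < 2/K. The sketch below, with illustrative initial values chosen so that K starts above 2/η, shows the eigenvalue transiently oscillating and then settling below the stability threshold:

```python
import numpy as np

# Hypothetical toy example (not the paper's setup): a two-layer linear
# model f = u * w trained on a single target y = 1 with squared loss.
# Its 1x1 NTK is K = u^2 + w^2, and GD is locally stable iff eta < 2/K.
eta = 0.4                       # step size; stability threshold is 2/eta = 5
u, w, y = 2.5, 0.5, 1.0         # initial NTK K = 6.5 > 2/eta: unstable regime
ntk_hist, loss_hist = [], []
for _ in range(200):
    r = u * w - y               # residual
    ntk_hist.append(u**2 + w**2)
    loss_hist.append(0.5 * r**2)
    # simultaneous GD update on both layers
    u, w = u - eta * w * r, w - eta * u * r

# The NTK eigenvalue oscillates at first, then drops below 2/eta,
# after which GD converges.
print(f"initial NTK {ntk_hist[0]:.2f}, final NTK {ntk_hist[-1]:.2f}, "
      f"threshold 2/eta = {2 / eta:.1f}")
```

This shows the self-stabilization side of EoS; sustained oscillation of the eigenvalue around 2/η, as studied in the paper, requires richer models than this scalar example.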
📝 Abstract
The study of Neural Tangent Kernels (NTKs) in deep learning has drawn increasing attention in recent years. The NTK typically changes actively during training, a process linked to feature learning. In parallel, recent work on Gradient Descent (GD) has identified a phenomenon called Edge of Stability (EoS), in which the largest eigenvalue of the NTK oscillates around a value inversely proportional to the step size. However, although follow-up works have explored the mechanism behind this eigenvalue behavior in depth, an understanding of how the NTK eigenvectors behave during EoS is still missing. This paper examines the dynamics of NTK eigenvectors during EoS in detail. Across different architectures, we observe that larger learning rates cause the leading eigenvectors of the final NTK, as well as the full NTK matrix, to align more closely with the training target. We then study the underlying mechanism of this phenomenon and provide a theoretical analysis for a two-layer linear network. Our study enhances the understanding of GD training dynamics in deep learning.
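The paper's central quantity, the alignment between the leading NTK eigenvector and the training target, can be computed directly once the empirical NTK matrix is in hand. The following is a minimal sketch under assumed shapes and a toy two-layer linear network f(x) = vᵀWx (whose NTK has the closed form K(x, x′) = ‖v‖² xᵀx′ + xᵀWᵀWx′); none of these names or sizes come from the paper:

```python
import numpy as np

# Illustrative sketch: empirical NTK of a two-layer linear network
# f(x) = v^T W x, and the alignment of its top eigenvector with the labels.
# All shapes and initializations below are assumptions for the example.
rng = np.random.default_rng(0)
n, d, h = 32, 10, 20                    # samples, input dim, hidden width
X = rng.standard_normal((n, d))
y = X @ rng.standard_normal(d)          # labels from a linear teacher

W = rng.standard_normal((h, d)) / np.sqrt(d)
v = rng.standard_normal(h) / np.sqrt(h)

# For f(x) = v^T W x: grad_W f = v x^T and grad_v f = W x, hence
# K(x, x') = ||v||^2 x^T x' + x^T W^T W x'.
K = (v @ v) * (X @ X.T) + X @ (W.T @ W) @ X.T

eigvals, eigvecs = np.linalg.eigh(K)    # ascending eigenvalues
u_top = eigvecs[:, -1]                  # eigenvector of the largest eigenvalue

# Cosine alignment between the top NTK eigenvector and the label vector;
# 1 means the target lies entirely in the top eigendirection.
alignment = abs(u_top @ y) / np.linalg.norm(y)
print(f"top-eigenvector/label alignment: {alignment:.3f}")
```

Tracking this scalar over training, at different learning rates, is one way to reproduce the kind of alignment measurement the abstract describes.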