🤖 AI Summary
Understanding the learning dynamics of neural networks during real-world training remains challenging, particularly regarding the evolution of the empirical Neural Tangent Kernel (eNTK) and its geometric implications.
Method: We systematically track eNTK trajectories during training, measure kernel (matrix) similarity across checkpoints, and model manifold constraints to characterize how the kernel evolves.
Contribution/Results: We empirically identify and name the “cone effect”—a previously unreported two-phase dynamic: an initial phase of rapid eNTK evolution (the “rich regime”) followed by convergence of the kernel into a narrow, low-dimensional cone. This effect refines Fort et al.’s two-stage hypothesis by revealing an intrinsic geometric constraint that operates beyond linearization assumptions. Extensive experiments across diverse architectures (e.g., ResNet, ViT) and datasets (CIFAR-10/100, ImageNet subsets) confirm the universality of the cone effect; training in this regime consistently yields better generalization and convergence stability than fully linearized training, offering a novel geometric perspective on the nonlinear optimization landscape of deep learning.
📝 Abstract
Understanding the learning dynamics of neural networks is a central topic in the deep learning community. In this paper, we take an empirical perspective to study the learning dynamics of neural networks in real-world settings. Specifically, we investigate the evolution process of the empirical Neural Tangent Kernel (eNTK) during training. Our key findings reveal a two-phase learning process: i) in Phase I, the eNTK evolves significantly, signaling the rich regime, and ii) in Phase II, the eNTK keeps evolving but is constrained in a narrow space, a phenomenon we term the cone effect. This two-phase framework builds on the hypothesis proposed by Fort et al. (2020), but we uniquely identify the cone effect in Phase II, demonstrating its significant performance advantages over fully linearized training.
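The core measurement described above—tracking how the eNTK changes over training—can be sketched in a toy setting. The snippet below is a minimal NumPy illustration, not the paper's actual pipeline: it builds a tiny one-hidden-layer network, forms the empirical NTK Gram matrix from per-sample parameter gradients (via finite differences, where a real implementation would use exact autodiff gradients), trains with plain gradient descent, and records a simple kernel-similarity score against the initial kernel. The similarity metric (cosine of centered, flattened kernels) and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-hidden-layer scalar-output MLP; theta packs both weight blocks.
n_in, n_hid = 3, 8
n_params = n_in * n_hid + n_hid

def forward(theta, X):
    W1 = theta[: n_in * n_hid].reshape(n_in, n_hid)
    w2 = theta[n_in * n_hid :]
    return np.tanh(X @ W1) @ w2          # shape: (n_samples,)

def jacobian(theta, X, eps=1e-5):
    # Per-sample gradient of the output w.r.t. all parameters,
    # via central finite differences (autodiff in practice).
    J = np.zeros((X.shape[0], theta.size))
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        J[:, i] = (forward(theta + d, X) - forward(theta - d, X)) / (2 * eps)
    return J

def entk(theta, X):
    # Empirical NTK Gram matrix: K[i, j] = <grad f(x_i), grad f(x_j)>.
    J = jacobian(theta, X)
    return J @ J.T

def kernel_similarity(K1, K2):
    # Cosine similarity of centered, flattened kernels — a simple
    # stand-in for the alignment metrics used to compare eNTKs.
    a, b = K1 - K1.mean(), K2 - K2.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

X = rng.normal(size=(10, n_in))
y = rng.normal(size=10)
theta = rng.normal(scale=0.5, size=n_params)

kernels = [entk(theta, X)]
for step in range(20):                    # plain gradient descent on MSE
    J = jacobian(theta, X)
    resid = forward(theta, X) - y
    theta -= 0.05 * (J.T @ resid) / len(y)
    kernels.append(entk(theta, X))

# Similarity of each checkpoint's kernel to the initial kernel; a rapid
# early drop followed by a plateau would mirror the two-phase picture.
sims = [kernel_similarity(kernels[0], K) for K in kernels]
print(sims[0], sims[-1])
```

In this toy setup one would inspect the trajectory of `sims` (and similarities between successive checkpoints) to separate a fast-changing early phase from a later phase where the kernel keeps moving but stays close to a fixed direction.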