🤖 AI Summary
This work investigates the mechanisms and characteristics underlying phase transitions in deep neural network (DNN) training dynamics. Conventional analyses lack a unified framework to characterize the shift from chaotic early-stage learning to stable late-stage convergence. Method: We propose a two-phase learning paradigm—(i) an early *sensitive exploration phase*, marked by high output sensitivity to infinitesimal parameter perturbations (“chaos effect”), and (ii) a late *stable refinement phase*, where the empirical Neural Tangent Kernel (eNTK) evolution is confined to a narrow angular cone manifold (“cone effect”). Our approach integrates interval-based state comparison, controlled perturbation analysis, temporal eNTK tracking, and angular geometric modeling. Contributions: We empirically identify a critical transition point during training, quantitatively characterize the chaos-to-stability phase transition, and demonstrate that the eNTK does not converge in the conventional sense but remains angularly constrained—a finding that revises the standard eNTK convergence assumption and advances theoretical understanding of generalization and optimization trajectories.
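The perturbation analysis described above can be sketched on a toy model. The snippet below is a minimal illustration, not the paper's actual setup: it assumes a scalar model `f(x) = a * tanh(b * x)` trained by SGD on squared loss, injects an imperceptibly small perturbation `eps` into one parameter, trains both copies with identical updates, and records how the output gap evolves, which is the quantity whose chaotic vs. stable growth distinguishes the two phases.

```python
import math
import random

def f(a, b, x):
    """Toy scalar model: f(x) = a * tanh(b * x)."""
    return a * math.tanh(b * x)

def sgd_step(a, b, batch, lr):
    """One SGD step on mean squared loss over a small batch."""
    ga = gb = 0.0
    for x, y in batch:
        t = math.tanh(b * x)
        err = a * t - y
        ga += 2 * err * t                      # dL/da
        gb += 2 * err * a * x * (1 - t * t)    # dL/db
    n = len(batch)
    return a - lr * ga / n, b - lr * gb / n

def divergence_after_perturbation(a0, b0, eps, data, lr=0.1, steps=50, probe=1.0):
    """Train an unperturbed and an eps-perturbed copy with identical SGD
    updates; record the output gap |f1(probe) - f2(probe)| at each step."""
    a1, b1 = a0, b0
    a2, b2 = a0 + eps, b0  # imperceptibly small parameter perturbation
    gaps = []
    for _ in range(steps):
        gaps.append(abs(f(a1, b1, probe) - f(a2, b2, probe)))
        a1, b1 = sgd_step(a1, b1, data, lr)
        a2, b2 = sgd_step(a2, b2, data, lr)
    return gaps

random.seed(0)
data = [(x, math.sin(x)) for x in [random.uniform(-2, 2) for _ in range(16)]]
gaps = divergence_after_perturbation(0.5, 0.5, eps=1e-6, data=data)
```

In the paper's framing, injecting the same `eps` at an early checkpoint versus a late one and comparing the resulting gap trajectories is what reveals the chaos-to-stability transition; this sketch only shows the measurement itself.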
📝 Abstract
Understanding how deep neural networks learn remains a fundamental challenge in modern machine learning. A growing body of evidence suggests that training dynamics undergo a distinct phase transition, yet our understanding of this transition is still incomplete. In this paper, we introduce an interval-wise perspective that compares network states across a time window, revealing two new phenomena that illuminate the two-phase nature of deep learning: (i) **The Chaos Effect.** By injecting an imperceptibly small parameter perturbation at various stages of training, we show that the network's response transitions from chaotic to stable, suggesting an early critical period in which the network is highly sensitive to initial conditions; (ii) **The Cone Effect.** Tracking the evolution of the empirical Neural Tangent Kernel (eNTK), we find that after this transition point the model's functional trajectory is confined to a narrow cone-shaped subset: while the kernel continues to change, its direction remains trapped within a tight angular region. Together, these effects provide a structural, dynamical view of how deep networks transition from sensitive exploration to stable refinement during training.
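The angular confinement behind the cone effect can likewise be sketched on a toy model. This is an illustrative assumption, not the paper's implementation: for a scalar model `f(x) = a * tanh(b * x)`, the eNTK on a set of probe inputs is the Gram matrix of parameter gradients, `K[i][j] = ∇f(x_i) · ∇f(x_j)`; treating kernels at two checkpoints as flattened vectors, the angle between them measures how far the kernel's direction has moved.

```python
import math

def grad_f(a, b, x):
    """Parameter gradient of f(x) = a * tanh(b * x) w.r.t. (a, b)."""
    t = math.tanh(b * x)
    return (t, a * x * (1 - t * t))

def entk(a, b, xs):
    """Empirical NTK on probe inputs: K[i][j] = grad_f(x_i) . grad_f(x_j)."""
    gs = [grad_f(a, b, x) for x in xs]
    return [[g1[0] * g2[0] + g1[1] * g2[1] for g2 in gs] for g1 in gs]

def kernel_angle(K1, K2):
    """Angle (radians) between two kernels viewed as flattened vectors."""
    v1 = [e for row in K1 for e in row]
    v2 = [e for row in K2 for e in row]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    return math.acos(max(-1.0, min(1.0, dot / (n1 * n2))))

xs = [-1.0, 0.5, 1.5]                 # probe inputs (arbitrary choice)
K_t = entk(0.5, 0.5, xs)              # kernel at one checkpoint
K_t2 = entk(0.6, 0.45, xs)            # kernel at a later checkpoint
theta = kernel_angle(K_t, K_t2)       # small angle = cone-confined regime
```

Under the cone effect, plotting `theta` between successive checkpoints after the transition point would show the kernel still changing in magnitude while its angle to earlier kernels stays bounded, rather than converging to a fixed kernel.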