🤖 AI Summary
This work investigates how the approximation error between a finite-width and an infinite-width single-hidden-layer neural network evolves under mean-field scaling, focusing on learning single-index models whose information exponent may be arbitrarily large. The authors bound this error tightly via a differential equation that leverages the local Hessian structure of the dynamics, establishing, for the first time, control of the training-dynamics error that is global in time (i.e., beyond logarithmic time scales). Crucially, they identify a "self-concordance" property intrinsic to single-index models and prove that networks of polynomial width, rather than exponential width, suffice to closely track the mean-field dynamics throughout the entire training process. This contrasts with standard mean-field approximation arguments, whose width requirements grow exponentially with the training horizon, and yields a polynomial-width convergence guarantee valid over the whole (polynomial-in-$d$) training horizon.
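As a rough sketch of the problem setup (the notation below is illustrative and not taken from the paper): a mean-field-scaled two-layer network of width $m$ learning a single-index target can be written as follows, with the information exponent of the link function governing how hard the hidden direction is to detect.

```latex
% Illustrative setup; notation is ours, not the paper's.
% Mean-field-scaled two-layer network with m neurons:
\[
  f_m(x) \;=\; \frac{1}{m}\sum_{j=1}^{m} a_j\,\sigma\!\big(\langle w_j, x\rangle\big),
\]
% Well-specified single-index target depending on one hidden direction w^*:
\[
  y \;=\; g\big(\langle w^\star, x\rangle\big), \qquad x \in \mathbb{R}^d,
\]
% with information exponent k = index of the first nonzero Hermite coefficient
% of the link g; larger k makes the relevant direction harder to detect, so
% convergence times grow polynomially in the ambient dimension d.
```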
📝 Abstract
We study the approximation gap between the dynamics of a polynomial-width neural network and its infinite-width counterpart, both trained using projected gradient descent in the mean-field scaling regime. We demonstrate how to tightly bound this approximation gap through a differential equation governed by the mean-field dynamics. A key factor driving the growth of the solution of this ODE is the local Hessian of each particle, defined as the derivative of the particle's velocity in the mean-field dynamics with respect to its position. We apply our results to the canonical feature-learning problem of estimating a well-specified single-index model; we permit the information exponent to be arbitrarily large, leading to convergence times that grow polynomially in the ambient dimension $d$. We show that, due to a certain "self-concordance" property in these problems (where the local Hessian of a particle is bounded by a constant times the particle's velocity), polynomially many neurons are sufficient to closely approximate the mean-field dynamics throughout training.
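To make the abstract's key quantities concrete, here is a schematic version of the argument (notation, norms, and constants are ours, not the paper's): the gap between a finite-width particle and its mean-field counterpart satisfies a Gronwall-type differential inequality driven by the local Hessian, and self-concordance keeps the resulting exponential amplification under control.

```latex
% Schematic version of the argument; notation, norms, and constants are ours.
% Mean-field dynamics of a particle w_t and its local Hessian:
\[
  \dot{w}_t = v(w_t;\mu_t),
  \qquad
  H_t := \nabla_w\, v(w;\mu_t)\big|_{w=w_t}.
\]
% The gap Delta_t between a finite-width particle and its infinite-width
% counterpart obeys a differential inequality driven by the local Hessian:
\[
  \frac{d}{dt}\,\|\Delta_t\| \;\lesssim\; \|H_t\|\,\|\Delta_t\| + \varepsilon_m,
\]
% where eps_m collects fluctuation terms that vanish as the width m grows.
% Self-concordance: the local Hessian is controlled by the particle's speed,
\[
  \|H_t\| \;\le\; C\,\|v(w_t;\mu_t)\|,
\]
% so, by Gronwall, the amplification factor is at most exp(C * path length of
% the particle), which can remain bounded even over a poly(d) training horizon;
% hence polynomially many neurons suffice to track the mean-field dynamics.
```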