Propagation of Chaos in One-hidden-layer Neural Networks beyond Logarithmic Time

📅 2025-04-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work studies the approximation error between a finite-width one-hidden-layer neural network and its infinite-width counterpart under mean-field scaling, for the task of learning a single-index model with arbitrarily large information exponent. The authors bound this error tightly through a differential equation driven by the local Hessian structure of the mean-field dynamics, obtaining control of the coupling error beyond the logarithmic time scales handled by standard propagation-of-chaos arguments. Crucially, they identify a "self-concordance" property of these problems, in which each particle's local Hessian is bounded by a constant times its velocity, and prove that polynomial width, rather than the exponential width suggested by naive Gronwall-type bounds, suffices to closely track the mean-field dynamics throughout training, even over the poly($d$) horizons that large information exponents require.
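
As a schematic of the bounding technique, in our own notation rather than the paper's and with constants suppressed: let $w_i(t)$ denote particle $i$, $\mu_t$ the mean-field law, $v(w, \mu)$ the mean-field velocity, $H(w, \mu) := \partial_w v(w, \mu)$ the local Hessian, and $\Delta_t$ the coupling distance between the width-$m$ network and the mean-field flow. The error then obeys a Gronwall-type ODE of the form

$$\frac{d}{dt}\,\Delta_t \;\lesssim\; \Big(\sup_i \big\|H\big(w_i(t), \mu_t\big)\big\|\Big)\,\Delta_t \;+\; \tilde{O}\big(m^{-1/2}\big), \qquad \text{so} \qquad \Delta_T \;\lesssim\; \tilde{O}\big(m^{-1/2}\big)\, \exp\!\Big(\int_0^T \sup_i \big\|H\big(w_i(t), \mu_t\big)\big\|\, dt\Big).$$

A naive bound on the exponent grows linearly in $T$, which forces exponential width once $T \gg \log d$. The self-concordance property, $\|H(w_i(t), \mu_t)\| \le C\, \|v(w_i(t), \mu_t)\|$, instead bounds the integral by a constant times each particle's total path length, which can remain controlled even over horizons polynomial in $d$, so polynomial width suffices.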

📝 Abstract
We study the approximation gap between the dynamics of a polynomial-width neural network and its infinite-width counterpart, both trained using projected gradient descent in the mean-field scaling regime. We demonstrate how to tightly bound this approximation gap through a differential equation governed by the mean-field dynamics. A key factor influencing the growth of this ODE is the local Hessian of each particle, defined as the derivative of the particle's velocity in the mean-field dynamics with respect to its position. We apply our results to the canonical feature learning problem of estimating a well-specified single-index model; we permit the information exponent to be arbitrarily large, leading to convergence times that grow polynomially in the ambient dimension $d$. We show that, due to a certain "self-concordance" property in these problems, where the local Hessian of a particle is bounded by a constant times the particle's velocity, polynomially many neurons are sufficient to closely approximate the mean-field dynamics throughout training.
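
As a minimal numerical sketch of this coupling statement (not the paper's construction), the snippet below evolves a width-$m$ network alongside a much wider proxy for the mean-field flow from a shared initialization, using projected gradient descent on the sphere for a single-index task, and tracks the particle-wise coupling gap. The quadratic link (information exponent 2), the closed-form population gradient for Gaussian inputs, and all sizes are illustrative assumptions; the paper allows arbitrarily large information exponents.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; the paper's regime has width polynomial in d.
d, m, M = 16, 32, 2048        # ambient dimension, finite width, mean-field proxy width
steps, lr = 400, 0.05         # training horizon and step size (arbitrary choices)

theta = np.eye(d)[0]          # teacher direction of the single-index model

def velocity(W):
    """Per-particle mean-field velocity for the population loss
    E[(f_W(x) - y(x))^2] / 2, with f_W(x) = mean_i act(<w_i, x>),
    act(z) = z^2 - 1 (He_2 link, an assumption for tractability),
    y(x) = act(<theta, x>), x ~ N(0, I_d), and all neurons unit-norm.
    Under these assumptions the Gaussian expectation has the closed form
    v_i = 4 <theta, w_i> theta - (4/m) sum_j <w_i, w_j> w_j."""
    G = W @ W.T                                 # Gram matrix of the neurons
    interact = 4.0 * (G @ W) / W.shape[0]       # neuron-neuron interaction term
    signal = 4.0 * np.outer(W @ theta, theta)   # attraction toward the teacher
    return signal - interact

def pgd_step(W):
    """One projected-gradient step: move along the tangential component of
    the velocity, then project back onto the unit sphere."""
    v = velocity(W)
    v_tan = v - np.sum(v * W, axis=1, keepdims=True) * W
    W = W + lr * v_tan
    return W / np.linalg.norm(W, axis=1, keepdims=True)

# Couple the width-m network to the width-M proxy of the mean-field flow:
# the first m particles of the large system share the small system's init.
W_big = rng.standard_normal((M, d))
W_big /= np.linalg.norm(W_big, axis=1, keepdims=True)
W_small = W_big[:m].copy()

for t in range(steps + 1):
    if t % 100 == 0:
        gap = np.linalg.norm(W_small - W_big[:m], axis=1).max()
        align = np.abs(W_big @ theta).mean()
        print(f"t={t:4d}  coupling gap={gap:.3e}  mean |<w, theta>|={align:.3f}")
    W_small, W_big = pgd_step(W_small), pgd_step(W_big)
```

In runs of this sketch one should expect the coupling gap to stay small rather than grow exponentially in $t$: as particles settle near $\pm\theta$ their velocities, and by self-concordance their local Hessians, shrink, so the error stops accumulating.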
Problem

Research questions and friction points this paper is trying to address.

Bounding the approximation gap between finite-width and infinite-width network dynamics
Understanding how each particle's local Hessian drives the growth of the error ODE
Estimating a well-specified single-index model with only polynomially many neurons
Innovation

Methods, ideas, or system contributions that make the work stand out.

Tight bound on the approximation gap via an ODE governed by the mean-field dynamics
Analysis of how the local Hessian controls the error accumulated along particle trajectories
Proof that polynomially many neurons suffice to closely track the mean-field dynamics