🤖 AI Summary
This work quantifies the discrepancy between the predictions of a two-layer neural network trained by stochastic gradient descent (SGD) and those of its mean-field limit, via the empirical distribution of the network parameters. By establishing Talagrand-type transport inequalities along the SGD trajectory, the authors obtain, for the first time, explicit constants independent of the number of iterations. They prove time-uniform concentration of the empirical parameter measure around the mean-field limit in the Wasserstein-1 (W₁) distance, and analogous bounds in the sliced Wasserstein-1 (SW₁) distance that yield dimension-free rates. These concentration results translate into high-probability, time-uniform bounds on the prediction error against a fixed test function. The analysis provides new non-asymptotic guarantees for SGD in the mean-field regime.
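To make the setting concrete, here is a minimal sketch (not the authors' code) of the object being studied: a two-layer network in mean-field scaling, f(x) = (1/N) Σᵢ aᵢ σ(wᵢ·x), trained by online SGD on the quadratic loss with ridge regularization. The data model, activation, and all hyperparameters are illustrative assumptions; the empirical measure of the particles (aᵢ, wᵢ) is what the paper shows concentrates around its mean-field limit.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 1000, 20          # number of neurons, input dimension (assumed values)
lr, lam = 0.1, 1e-3      # step size and ridge penalty (assumed values)

# Particles theta_i = (a_i, w_i); their empirical measure (1/N) sum_i delta_{theta_i}
# is the object whose concentration around the mean-field limit is quantified.
a = rng.normal(size=N)
W = rng.normal(size=(N, d)) / np.sqrt(d)

def predict(x):
    """Mean-field-scaled prediction: average of a_i * relu(w_i . x) over neurons."""
    return np.mean(a * np.maximum(W @ x, 0.0))

for step in range(10_000):
    x = rng.normal(size=d)                  # fresh sample (online SGD)
    y = np.tanh(x[0])                       # toy target, purely illustrative
    pre = np.maximum(W @ x, 0.0)            # relu(w_i . x)
    err = predict(x) - y                    # residual of the quadratic loss
    # Gradients of (1/2)(f(x)-y)^2 + (lam/2N) sum_i ||theta_i||^2 w.r.t. theta_i
    # carry a 1/N factor; scaling the step size by N (a common mean-field
    # convention) makes the per-neuron update O(1), so the 1/N cancels below.
    grad_a = err * pre + lam * a
    grad_W = (err * a * (pre > 0))[:, None] * x[None, :] + lam * W
    a -= lr * grad_a
    W -= lr * grad_W
```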
📝 Abstract
We quantify, uniformly over time and with high probability, the discrepancy between the predictions of a two-layer neural network trained by stochastic gradient descent (SGD) and their mean-field limit, for quadratic loss and ridge regularization. As a key ingredient, we establish $T_p$ transportation inequalities ($p \in \{1, 2\}$) for the law of the SGD parameters, with explicit constants independent of the iteration index. We then prove uniform-in-time concentration of the empirical parameter measure around its mean-field limit in the Wasserstein distance $W_1$, and we translate these bounds into prediction-error estimates against a fixed test function $\Phi$. We also derive analogous concentration bounds in the sliced-Wasserstein distance $SW_1$, leading to dimension-free rates.
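As a complement to the distances named in the abstract, the following is a minimal sketch (an assumption-laden illustration, not the paper's code) of how one might empirically estimate the sliced-Wasserstein-1 distance $SW_1$ between two equal-size parameter clouds, e.g. the SGD particles at some iterate versus i.i.d. samples from a candidate limiting law.

```python
import numpy as np

def w1_1d(u, v):
    """Exact Wasserstein-1 distance between two 1-D samples of equal size:
    mean absolute difference of the sorted values (quantile coupling)."""
    return np.mean(np.abs(np.sort(u) - np.sort(v)))

def sliced_w1(X, Y, n_proj=200, rng=None):
    """Monte Carlo estimate of SW_1: average the 1-D W_1 distance of the two
    point clouds projected onto random directions on the unit sphere."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    dirs = rng.normal(size=(n_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    return np.mean([w1_1d(X @ u, Y @ u) for u in dirs])

# Illustrative usage: two clouds of N particles in dimension d (toy data only).
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 20))            # e.g. SGD parameters at some iterate
Y = 1.05 * rng.normal(size=(1000, 20))     # e.g. samples from the limiting law
print(sliced_w1(X, Y, rng=rng))
```

Averaging one-dimensional $W_1$ distances over random projections is what makes $SW_1$ cheap to estimate and is consistent with the dimension-free flavor of the bounds stated above.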