🤖 AI Summary
This work studies how well a fully connected deep neural network of finite width is approximated in distribution by its infinite-width Gaussian limit. The authors develop a Lindeberg principle for deep neural networks: the weights are replaced by Gaussian random variables one layer at a time, and the 2-Wasserstein distance is used to bound the gap between the network's output distribution and the Gaussian limit. The resulting bounds are quantitative, hold for general weight distributions, and require only suitable regularity of the activation function.
📝 Abstract
We consider the infinite-width limit of a fully connected deep neural network with general weights, and we prove quantitative general bounds on the $2$-Wasserstein distance between the network and its infinite-width Gaussian limit, under appropriate regularity assumptions on the activation function. Our main tool is a Lindeberg principle for Deep Neural Networks, which we use to successively replace the weights on each layer by Gaussian random variables.
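The effect of replacing general weights by Gaussian ones can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's construction: it samples scalar outputs of a small fully connected network at a fixed input, once with Rademacher weights and once with Gaussian weights, and estimates the one-dimensional 2-Wasserstein distance between the two samples from their sorted values. The function names (`sample_outputs`, `empirical_w2`, `gaussian`, `rademacher`), the `1/sqrt(n_in)` scaling, the absence of biases, the `tanh` activation, and the scalar output are all assumptions made for illustration. It also compares two finite-width networks with each other rather than a network with its infinite-width limit.

```python
import numpy as np


def gaussian(rng, shape):
    """Standard normal weights (mean 0, variance 1)."""
    return rng.standard_normal(shape)


def rademacher(rng, shape):
    """Non-Gaussian weights: +1 or -1 with equal probability (mean 0, variance 1)."""
    return rng.choice([-1.0, 1.0], size=shape)


def sample_outputs(x, widths, n_samples, weight_sampler, rng, activation=np.tanh):
    """Draw n_samples independent networks and return their scalar outputs at x.

    Assumed architecture (illustrative only): no biases, weights scaled by
    1/sqrt(fan-in), activation applied before every layer except the first.
    """
    h = np.tile(x, (n_samples, 1))            # shape (n_samples, d_in)
    dims = [x.shape[0], *widths, 1]           # layer sizes, scalar output
    for layer, (n_in, n_out) in enumerate(zip(dims[:-1], dims[1:])):
        if layer > 0:
            h = activation(h)
        W = weight_sampler(rng, (n_samples, n_in, n_out))
        h = np.einsum("si,sio->so", h, W) / np.sqrt(n_in)
    return h[:, 0]


def empirical_w2(a, b):
    """2-Wasserstein distance between two equal-size 1D samples.

    In one dimension the optimal coupling matches sorted values.
    """
    a, b = np.sort(a), np.sort(b)
    return float(np.sqrt(np.mean((a - b) ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.ones(16)
    for width in (4, 16, 64):
        general = sample_outputs(x, [width, width], 2000, rademacher, rng)
        gauss = sample_outputs(x, [width, width], 2000, gaussian, rng)
        print(width, empirical_w2(general, gauss))
```

The printed values are noisy estimates that include sampling error from the finite number of draws, so they should be read as a rough indication only, not as a check of the paper's bounds.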