🤖 AI Summary
This work studies shallow ReLU neural networks with randomly initialized parameters, both at finite expected width and in the infinite-width limit, challenging the conventional belief that such limits are invariably Gaussian processes.
Method: Adopting a non-asymptotic framework, we model the width (the number of neurons whose activation thresholds fall in each bounded region of the input domain) as a Poisson random variable with mean proportional to a density parameter, and leverage tools from stochastic processes, Poisson point processes, and non-Gaussian limit theorems.
Contribution/Results: We rigorously establish that the network defines a well-posed non-Gaussian stochastic process, specifically a process driven by impulsive white noise that is isotropic and wide-sense self-similar with Hurst exponent 3/2. We derive a simple closed-form expression for its autocovariance function and show that the process is parameterized by the law of the weights and biases together with the density of activation thresholds. Moreover, as the expected width tends to infinity, the network can converge in law either to a Gaussian process or, depending on the law of the weights, to a non-Gaussian process. These results revisit the classical "wide networks converge to Gaussian processes" paradigm and provide a non-asymptotic characterization of non-Gaussian neural network limits.
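The construction above can be illustrated with a small simulation. The sketch below samples one realization of a random shallow ReLU network on a 1-D interval: the width is Poisson with mean proportional to the threshold density, thresholds are drawn uniformly from the interval, and weights are i.i.d. This is a hedged illustration, not the paper's construction; all function and parameter names (`sample_relu_network`, `lam`, `weight_law`) are illustrative, and the 1-D setting and uniform threshold law are simplifying assumptions.

```python
import numpy as np

def sample_relu_network(rng, lam=50.0, domain=(-1.0, 1.0), weight_law=None):
    """Sample one realization of a random shallow ReLU network on an interval.

    Illustrative sketch (not the paper's notation): the number of neurons
    whose activation thresholds land in `domain` is Poisson with mean
    lam * |domain| (a Poisson point process), thresholds are uniform on the
    domain, each ReLU has a random orientation, and weights are i.i.d.
    """
    a, b = domain
    n = rng.poisson(lam * (b - a))           # random width
    thresholds = rng.uniform(a, b, size=n)   # activation thresholds (biases)
    signs = rng.choice([-1.0, 1.0], size=n)  # orientation of each ReLU
    if weight_law is None:
        weights = rng.standard_normal(n)     # finite-variance weights
    else:
        weights = weight_law(rng, n)

    def f(x):
        # Evaluate sum_k w_k * ReLU(s_k * (x - t_k)) at each input point.
        x = np.asarray(x, dtype=float)[..., None]
        return np.sum(weights * np.maximum(signs * (x - thresholds), 0.0),
                      axis=-1)

    return f

rng = np.random.default_rng(0)
f = sample_relu_network(rng)
xs = np.linspace(-1.0, 1.0, 5)
print(f(xs))  # one sample path evaluated at 5 points
```

Each call with a fresh generator yields an independent sample path of the process; fixing the seed makes the realization reproducible.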
📝 Abstract
We consider a large class of shallow neural networks with randomly initialized parameters and rectified linear unit activation functions. We prove that these random neural networks are well-defined non-Gaussian processes. As a by-product, we demonstrate that these networks are solutions to stochastic differential equations driven by impulsive white noise (combinations of random Dirac measures). These processes are parameterized by the law of the weights and biases as well as the density of activation thresholds in each bounded region of the input domain. We prove that these processes are isotropic and wide-sense self-similar with Hurst exponent 3/2. We also derive a remarkably simple closed-form expression for their autocovariance function. Our results are fundamentally different from prior work in that we consider a non-asymptotic viewpoint: The number of neurons in each bounded region of the input domain (i.e., the width) is itself a random variable with a Poisson law with mean proportional to the density parameter. Finally, we show that, under suitable hypotheses, as the expected width tends to infinity, these processes can converge in law not only to Gaussian processes, but also to non-Gaussian processes depending on the law of the weights. Our asymptotic results provide a new take on several classical results (wide networks converge to Gaussian processes) as well as some new ones (wide networks can converge to non-Gaussian processes).
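The dual convergence claim can be probed empirically. At a fixed input point, the network's value is a Poisson-size sum of i.i.d. neuron contributions: with finite-variance weights a central-limit effect pushes its law toward Gaussian as the expected width grows, while heavy-tailed weights can keep the limit non-Gaussian. The Monte Carlo sketch below (an assumption-laden illustration, not the paper's proof; names like `network_value` and `lam` are hypothetical) compares the empirical excess kurtosis at one point for Gaussian versus Cauchy weights; a Gaussian limit has excess kurtosis near 0.

```python
import numpy as np

def network_value(rng, x0, lam, weight_sampler):
    """Value at x0 of one random shallow ReLU network on [-1, 1]."""
    n = rng.poisson(lam * 2.0)                   # Poisson width on [-1, 1]
    thresholds = rng.uniform(-1.0, 1.0, size=n)  # activation thresholds
    signs = rng.choice([-1.0, 1.0], size=n)      # ReLU orientations
    weights = weight_sampler(rng, n)
    return np.sum(weights * np.maximum(signs * (x0 - thresholds), 0.0))

def excess_kurtosis(samples):
    """Fisher (excess) kurtosis: 0 for an exactly Gaussian sample law."""
    z = (samples - samples.mean()) / samples.std()
    return np.mean(z**4) - 3.0

rng = np.random.default_rng(1)
lam, trials, x0 = 200.0, 5000, 0.3
gauss = np.array([network_value(rng, x0, lam,
                                lambda r, n: r.standard_normal(n))
                  for _ in range(trials)])
heavy = np.array([network_value(rng, x0, lam,
                                lambda r, n: r.standard_cauchy(n))
                  for _ in range(trials)])
print(excess_kurtosis(gauss))  # near 0: consistent with a Gaussian limit
print(excess_kurtosis(heavy))  # large: heavy-tailed, non-Gaussian behavior
```

The contrast mirrors the abstract's dichotomy: the limiting law depends on the law of the weights, not on the width alone.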