Near-optimal estimates for the $\ell^p$-Lipschitz constants of deep random ReLU neural networks

📅 2025-06-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the Lipschitz constant of random ReLU neural networks under the $\ell^p$ norm, where the weights follow a variant of He initialization and the biases are symmetrically distributed. We systematically characterize the phase-transition behavior of this constant with respect to network depth and width for $p \in [1,\infty]$. Leveraging tools from high-dimensional probability and random matrix theory, we establish, for the first time, nearly tight upper and lower bounds on the $\ell^p$-Lipschitz constant of wide and deep random ReLU networks. Specifically, the constant grows exponentially with depth when $p \in [1,2)$, but only polynomially when $p \in [2,\infty]$. In the wide-network regime, the bounds differ by at most a logarithmic factor; for shallow networks, they match exactly. Furthermore, our analysis uncovers an intrinsic connection between the Lipschitz constant and norms of standard Gaussian vectors: $\Vert g \Vert_{p'}$ (with $1/p + 1/p' = 1$) for $p \in [2,\infty]$ and $\Vert g \Vert_2$ for $p \in [1,2)$.
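
To make the quantity concrete, the following is a minimal numerical sketch, not the paper's code: the He-style variance 2/fan-in, zero biases, the specific sizes, and the gradient-sampling heuristic are all assumptions. It builds a deep random ReLU network $\Phi: \mathbb{R}^d \to \mathbb{R}$ and computes an empirical lower bound on its $\ell^p$-Lipschitz constant by evaluating the dual gradient norm $\Vert \nabla \Phi(x) \Vert_{p'}$ at random inputs.

```python
# Minimal sketch (assumed setup, not the paper's code): empirically lower-bound
# the l^p-Lipschitz constant of a random ReLU network Phi: R^d -> R by sampling
# the dual gradient norm ||grad Phi(x)||_{p'}, 1/p + 1/p' = 1, at random inputs.
# Weights use the standard He variance 2/fan_in; the paper uses a variant of He
# initialization and symmetric biases, simplified here to zero biases.
import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(d, width, depth):
    """He-style random weights: layer l maps dims[l] -> dims[l+1], entries N(0, 2/fan_in)."""
    dims = [d] + [width] * depth + [1]
    return [rng.normal(0.0, np.sqrt(2.0 / m), size=(n, m))
            for m, n in zip(dims[:-1], dims[1:])]

def grad_dual_norm(weights, x, p):
    """Return ||grad Phi(x)||_{p'} for the piecewise-linear ReLU network Phi."""
    # Forward pass, recording each hidden layer's ReLU activation pattern.
    a, masks = x, []
    for W in weights[:-1]:
        z = W @ a
        masks.append((z > 0).astype(float))
        a = np.maximum(z, 0.0)
    # Backward pass: grad Phi(x)^T = W_L D_{L-1} W_{L-1} ... D_1 W_1,
    # where D_l is the diagonal 0/1 activation pattern of hidden layer l.
    g = weights[-1]
    for W, m in zip(reversed(weights[:-1]), reversed(masks)):
        g = (g * m) @ W
    p_dual = np.inf if p == 1 else (1.0 if p == np.inf else p / (p - 1.0))
    return np.linalg.norm(g.ravel(), ord=p_dual)

d, width, depth, p = 50, 500, 5, 1
weights = random_relu_net(d, width, depth)
samples = rng.normal(size=(200, d))
lip_lower_bound = max(grad_dual_norm(weights, x, p) for x in samples)
print(f"empirical lower bound on the l^{p}-Lipschitz constant: {lip_lower_bound:.3f}")
```

Sampling gradients at finitely many points only yields a lower bound on the supremum; the paper's results are high-probability upper and lower bounds on the exact $\ell^p$-Lipschitz constant.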

📝 Abstract
This paper studies the $\ell^p$-Lipschitz constants of ReLU neural networks $\Phi: \mathbb{R}^d \to \mathbb{R}$ with random parameters for $p \in [1,\infty]$. The distribution of the weights follows a variant of the He initialization and the biases are drawn from symmetric distributions. We derive high probability upper and lower bounds for wide networks that differ by at most a factor that is logarithmic in the network's width and linear in its depth. In the special case of shallow networks, we obtain matching bounds. Remarkably, the behavior of the $\ell^p$-Lipschitz constant varies significantly between the regimes $p \in [1,2)$ and $p \in [2,\infty]$. For $p \in [2,\infty]$, the $\ell^p$-Lipschitz constant behaves similarly to $\Vert g \Vert_{p'}$, where $g \in \mathbb{R}^d$ is a $d$-dimensional standard Gaussian vector and $1/p + 1/p' = 1$. In contrast, for $p \in [1,2)$, the $\ell^p$-Lipschitz constant aligns more closely with $\Vert g \Vert_{2}$.
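
For context, and not part of the paper's abstract: the quantity studied is the usual $\ell^p$-Lipschitz constant. A standard formulation, stated here under the assumption that the paper uses the same conventions, is

\[
  \operatorname{Lip}_p(\Phi)
  \;=\; \sup_{x \neq y} \frac{|\Phi(x) - \Phi(y)|}{\Vert x - y \Vert_p}
  \;=\; \operatorname*{ess\,sup}_{x \in \mathbb{R}^d} \Vert \nabla \Phi(x) \Vert_{p'},
  \qquad \frac{1}{p} + \frac{1}{p'} = 1,
\]

where the second equality uses that a ReLU network is piecewise linear and hence differentiable almost everywhere. This dual-norm form plausibly explains why Gaussian norms appear: for a standard Gaussian $g \in \mathbb{R}^d$ and fixed $p' \in [1,\infty)$, $\Vert g \Vert_{p'}$ concentrates around a constant (depending on $p'$) times $d^{1/p'}$, while $\Vert g \Vert_{\infty}$ concentrates around $\sqrt{2 \log d}$.
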
Problem

Research questions and friction points this paper is trying to address.

Estimates Lipschitz constants of random ReLU networks
Compares bounds for different network widths and depths
Analyzes varying behavior across p-norm regimes [1,2) and [2,∞]
Innovation

Methods, ideas, or system contributions that make the work stand out.

Random ReLU networks with He initialization
High probability bounds for Lipschitz constants
Different behavior for p in [1,2) vs [2,∞]