🤖 AI Summary
This work investigates the Lipschitz constant of random ReLU neural networks under the $\ell^p$ norm, where the weights follow a variant of He initialization and the biases are symmetrically distributed. We systematically characterize the phase-transition behavior of this constant with respect to network depth and width for $p \in [1,\infty]$. Leveraging tools from high-dimensional probability and random matrix theory, we establish, for the first time, nearly tight upper and lower bounds on the $\ell^p$-Lipschitz constant of wide and deep random ReLU networks. Specifically, the constant grows exponentially with depth when $p \in [1,2)$, but only polynomially when $p \in [2,\infty]$. In the wide-network regime, the bounds differ by at most a logarithmic factor; for shallow networks, they match exactly. Furthermore, our analysis uncovers an intrinsic connection between the Lipschitz constant and the $\ell^{p'}$ and $\ell^2$ norms of standard Gaussian vectors.
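To make the depth dependence concrete, here is a minimal numerical sketch (not the paper's proof technique): it samples a random ReLU network with a He-style initialization (variance $2/\text{fan-in}$, zero biases; both are illustrative assumptions) and lower-bounds the $\ell^p$-Lipschitz constant by the largest dual norm $\Vert \nabla \Phi(x) \Vert_{p'}$ of the gradient over random inputs.

```python
# Hedged sketch: empirically lower-bound the l^p-Lipschitz constant of a random
# ReLU network via the dual norm ||grad Phi(x)||_{p'} at randomly sampled inputs.
# The initialization, widths, and estimator below are illustrative choices only.
import numpy as np

def random_relu_net(d, width, depth, rng):
    """Sample weights with a He-style scaling (variance 2/fan_in) and zero biases."""
    dims = [d] + [width] * depth + [1]
    return [(rng.normal(0.0, np.sqrt(2.0 / dims[i]), size=(dims[i + 1], dims[i])),
             np.zeros(dims[i + 1])) for i in range(len(dims) - 1)]

def grad_at(params, x):
    """Gradient of Phi at x via the chain rule (ReLU on all but the last layer)."""
    a, masks = x, []
    for i, (W, b) in enumerate(params):
        z = W @ a + b
        if i < len(params) - 1:
            masks.append((z > 0).astype(float))
            a = np.maximum(z, 0.0)
    J = params[-1][0]                      # output layer: 1 x width row vector
    for (W, _), m in zip(reversed(params[:-1]), reversed(masks)):
        J = (J * m) @ W                    # backpropagate through the ReLU mask
    return J.ravel()

def empirical_lipschitz(params, p, n_samples, rng):
    """Lower bound on the l^p-Lipschitz constant: max over samples of ||grad||_{p'}."""
    d = params[0][0].shape[1]
    q = np.inf if p == 1 else (1.0 if np.isinf(p) else p / (p - 1))  # dual exponent p'
    return max(np.linalg.norm(grad_at(params, rng.normal(size=d)), ord=q)
               for _ in range(n_samples))

rng = np.random.default_rng(0)
d, width = 50, 300
for p in (1, 2, np.inf):
    for depth in (1, 3, 5, 7):
        net = random_relu_net(d, width, depth, rng)
        print(f"p={p}, depth={depth}: ~{empirical_lipschitz(net, p, 200, rng):.2f}")
```

Because only finitely many inputs are sampled, the printed values are empirical lower bounds on the true Lipschitz constant rather than exact values.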
📝 Abstract
This paper studies the $\ell^p$-Lipschitz constants of ReLU neural networks $\Phi: \mathbb{R}^d \to \mathbb{R}$ with random parameters for $p \in [1,\infty]$. The distribution of the weights follows a variant of the He initialization and the biases are drawn from symmetric distributions. We derive high-probability upper and lower bounds for wide networks that differ at most by a factor that is logarithmic in the network's width and linear in its depth. In the special case of shallow networks, we obtain matching bounds. Remarkably, the behavior of the $\ell^p$-Lipschitz constant varies significantly between the regimes $p \in [1,2)$ and $p \in [2,\infty]$. For $p \in [2,\infty]$, the $\ell^p$-Lipschitz constant behaves similarly to $\Vert g \Vert_{p'}$, where $g \in \mathbb{R}^d$ is a $d$-dimensional standard Gaussian vector and $1/p + 1/p' = 1$. In contrast, for $p \in [1,2)$, the $\ell^p$-Lipschitz constant aligns more closely with $\Vert g \Vert_2$.
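For intuition on why the dual norm $\Vert g \Vert_{p'}$ appears, consider the degenerate linear case (an illustrative aside, not the paper's shallow-network result): for $\Phi(x) = \langle g, x \rangle$ with $g \in \mathbb{R}^d$, Hölder's inequality, with equality attained by a suitable choice of $x - y$, gives
$$\mathrm{Lip}_p(\Phi) = \sup_{x \neq y} \frac{|\langle g, x - y \rangle|}{\Vert x - y \Vert_p} = \Vert g \Vert_{p'}, \qquad \frac{1}{p} + \frac{1}{p'} = 1,$$
so for a standard Gaussian $g$ the $\ell^p$-Lipschitz constant of this toy map is exactly $\Vert g \Vert_{p'}$.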