Asymptotic convexity of wide and shallow neural networks

📅 2025-06-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses why wide, shallow neural networks are empirically easy to optimize—specifically, whether and how their loss landscapes become increasingly convex as width grows. Method: Focusing on single-hidden-layer networks, the authors analyze the epigraph structure of the input-output mapping in parameter space and employ high-dimensional asymptotic analysis to study the empirical risk function. Contribution/Results: They rigorously prove that, as the hidden-layer width tends to infinity, the empirical risk converges uniformly—in parameter space—to a convex function. This is the first theoretical explanation for the observed rapid convergence and scarcity of poor local minima in training wide shallow networks. The work identifies a “width-driven convexification” mechanism, distinct from classical over-parameterization arguments, highlighting how architectural width—not merely parameter redundancy—shapes optimization geometry. These findings offer a new perspective on the trainability of deep learning models.
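Schematically, the mechanism described above can be written as a uniform convergence statement (the symbols $R_m$, $R_\infty$, and $K$ are illustrative names chosen here, not the paper's notation):

```latex
% Illustrative formalization of "width-driven convexification":
% R_m = empirical risk of a width-m single-hidden-layer network,
% viewed as a function of the parameters theta.
\[
  \sup_{\theta \in K} \bigl| R_m(\theta) - R_\infty(\theta) \bigr|
  \;\longrightarrow\; 0
  \quad \text{as } m \to \infty,
  \qquad \text{with } R_\infty \text{ convex},
\]
% where K is a compact subset of parameter space. Near the limit,
% spurious local minima of R_m must be shallow, which is consistent
% with the observed ease of training wide shallow networks.
```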

📝 Abstract
For a simple model of shallow and wide neural networks, we show that the epigraph of its input-output map, as a function of the network parameters, approximates the epigraph of a convex function in a precise sense. This leads to a plausible explanation of their observed good performance.
Problem

Research questions and friction points this paper is trying to address.

Analyzes convexity in wide shallow neural networks
Examines input-output map epigraph approximation
Explains good performance of such networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Wide shallow networks approximate convex functions
Input-output map epigraph mimics convexity
Explains good performance via convex approximation
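The convex-approximation idea above can be probed numerically. The sketch below is an illustration only, not the paper's construction: it samples random parameter pairs for a width-`m` ReLU network and records the worst violation of the midpoint-convexity inequality `R((p+q)/2) <= (R(p)+R(q))/2`, which a convex risk would never violate. The `1/sqrt(m)` output scaling is an assumption made here for a well-behaved wide limit.

```python
import numpy as np

# Illustrative sketch only: probes midpoint convexity of the empirical
# risk of a one-hidden-layer ReLU network at random parameter pairs.
rng = np.random.default_rng(0)

def risk(theta, X, y, m):
    """Empirical squared-error risk of f(x) = a^T relu(W x) / sqrt(m),
    with theta = (W flattened row-major, then a)."""
    d = X.shape[1]
    W = theta[: m * d].reshape(m, d)
    a = theta[m * d:]
    preds = np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)
    return float(np.mean((preds - y) ** 2))

def midpoint_gap(m, n=100, d=4, trials=200):
    """Worst observed violation of R((p+q)/2) <= (R(p)+R(q))/2 over
    random parameter pairs; values <= 0 are consistent with convexity."""
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)
    gap = -np.inf
    for _ in range(trials):
        p = rng.standard_normal(m * (d + 1))
        q = rng.standard_normal(m * (d + 1))
        mid = risk((p + q) / 2, X, y, m)
        avg = (risk(p, X, y, m) + risk(q, X, y, m)) / 2
        gap = max(gap, mid - avg)
    return gap

# Probe a few widths; the paper's result suggests violations should
# become small as m grows (this sketch does not prove that).
for m in (4, 64, 1024):
    print(f"width {m:5d}: worst midpoint gap {midpoint_gap(m):+.4f}")
```

Restricting the segments to the output weights alone would make the risk exactly convex (it is quadratic in `a` for fixed `W`); the interesting regime, and the one the paper addresses, is convexity over the full parameter vector.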
Vivek S. Borkar
Department of Electrical Engineering (retd.), Indian Institute of Technology Bombay
Parthe Pandit
Thakur Family Chair Assistant Professor @ IIT Bombay
Machine learning · Statistics · Optimization · Signal processing