🤖 AI Summary
This paper addresses why wide, shallow neural networks are empirically easy to optimize, and in particular whether their loss landscapes become increasingly convex as width grows. Method: Focusing on single-hidden-layer networks, the authors analyze the epigraph of the input-output map, viewed as a function of the network parameters, and use high-dimensional asymptotic analysis to study the empirical risk. Contribution/Results: They show that, as the hidden-layer width tends to infinity, this epigraph approximates the epigraph of a convex function in a precise sense, so the risk landscape becomes effectively convex over parameter space. This offers a plausible theoretical explanation for the observed fast convergence and scarcity of poor local minima when training wide shallow networks. The work identifies a "width-driven convexification" mechanism, distinct from classical over-parameterization arguments: architectural width itself, not merely parameter redundancy, shapes the optimization geometry. These findings offer a new perspective on the trainability of deep learning models.
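A crude way to see such an effect numerically is to probe midpoint convexity of the empirical risk at random parameter pairs and watch the violations shrink as the width grows. The sketch below is a toy experiment, not the paper's construction: the tanh activation, the 1/m output scaling, the Gaussian data, and the random-pair probe are all assumptions made here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n samples in d dimensions with scalar targets.
n, d = 200, 5
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

def risk(params, m):
    """Empirical squared-error risk of a width-m one-hidden-layer tanh
    network with 1/m output scaling (an assumed scaling; the paper's
    precise model may differ)."""
    W = params[: m * d].reshape(m, d)   # hidden-layer weights
    a = params[m * d :]                 # output-layer weights
    preds = np.tanh(X @ W.T) @ a / m
    return np.mean((preds - y) ** 2)

for m in [4, 16, 64, 256, 1024]:
    p = m * d + m  # total parameter count
    gaps = []
    for _ in range(200):
        t1 = rng.standard_normal(p)
        t2 = rng.standard_normal(p)
        mid = risk((t1 + t2) / 2, m)
        avg = (risk(t1, m) + risk(t2, m)) / 2
        # For a convex risk, mid <= avg; a positive gap is a violation.
        gaps.append(max(mid - avg, 0.0))
    print(f"width {m:5d}: mean midpoint-convexity violation {np.mean(gaps):.4f}")
```

Under this scaling the violations along random chords shrink with width, consistent with the convexification picture; this is only a sanity check on random slices, not the epigraph argument the paper actually makes.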
📝 Abstract
For a simple model of shallow and wide neural networks, we show that the epigraph of the input-output map, as a function of the network parameters, approximates the epigraph of a convex function in a precise sense. This leads to a plausible explanation of the observed good performance of such networks.
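For reference, the epigraph of a function is the set of points on or above its graph. The notational sketch below also writes down a standard one-hidden-layer model as an assumed form; the paper's exact parameterization and scaling are not specified here.

```latex
% Epigraph of a map f : R^p -> R over the parameters theta:
\[
  \operatorname{epi} f
  \;=\;
  \bigl\{ (\theta, t) \in \mathbb{R}^{p} \times \mathbb{R} : t \ge f(\theta) \bigr\}.
\]
% An assumed standard one-hidden-layer network of width m
% (the paper's precise model and scaling may differ):
\[
  f(x;\theta) \;=\; \sum_{i=1}^{m} a_i \,\sigma\!\bigl(w_i^{\top} x\bigr),
  \qquad
  \theta = (a_i, w_i)_{i=1}^{m}.
\]
```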