🤖 AI Summary
This work investigates the geometric structure of the loss landscape near global minima and the convergence behavior of gradient flow for overparameterized two-layer neural networks. We address the phenomenon wherein global minima with zero generalization error become geometrically separated from other global minima as the sample size increases, and establish, for the first time, a quantitative link between this separation and generalization error, going beyond conventional analyses that focus solely on the existence of solutions while neglecting their identifiability. Method: Combining tools from nonconvex optimization, differential geometry, and gradient flow dynamics, we develop a novel local stability analysis framework and a technique for characterizing sample complexity. Contribution/Results: Under mild assumptions, we prove local recoverability: gradient flow converges to a zero-generalization-error solution at a rate jointly determined by the network width and the sample size.
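To make the dynamics concrete, below is a minimal sketch (not the paper's construction) of gradient flow on the empirical loss of a two-layer network, discretized by small Euler steps. The tanh activation, teacher targets, width m, sample size n, and step size are all illustrative assumptions chosen only to show the setting in which the summary's convergence claim lives.

```python
# Minimal sketch, assuming a standard two-layer setup: Euler-discretized
# gradient flow on the empirical squared loss.  All constants below are
# illustrative, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
d, m, n = 5, 200, 100                          # input dim, hidden width (overparameterized), samples
X = rng.standard_normal((n, d))
y = np.tanh(X @ rng.standard_normal(d))        # hypothetical teacher targets

W = rng.standard_normal((m, d)) / np.sqrt(d)   # hidden-layer weights
a = rng.standard_normal(m) / np.sqrt(m)        # output-layer weights

def loss(W, a):
    return 0.5 * np.mean((np.tanh(X @ W.T) @ a - y) ** 2)

dt = 1e-2                                      # small step approximating d(theta)/dt = -grad R_S(theta)
for _ in range(10000):
    H = np.tanh(X @ W.T)                       # (n, m) hidden activations
    r = H @ a - y                              # residuals on the training sample
    grad_a = H.T @ r / n                       # gradient w.r.t. output weights
    grad_W = ((r[:, None] * (1 - H ** 2) * a).T @ X) / n  # gradient w.r.t. hidden weights
    a -= dt * grad_a
    W -= dt * grad_W

print(f"final empirical loss: {loss(W, a):.3e}")
```

In this toy run the empirical loss is driven toward zero; the paper's question is when, near such a global minimum, the recovered solution also has zero generalization error, and how the local convergence rate depends on m and n.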
📝 Abstract
Under mild assumptions, we investigate the geometry of the loss landscape of two-layer neural networks in the vicinity of global minima. Using novel techniques, we demonstrate: (i) how global minima with zero generalization error become geometrically separated from other global minima as the sample size grows; and (ii) the local convergence properties and convergence rate of gradient flow dynamics. Our results indicate that two-layer neural networks can be locally recovered in the regime of overparameterization.
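For concreteness, a standard formulation of the objects the abstract refers to is sketched below; the notation (width m, sample set S of size n, activation σ) is assumed here and need not match the paper's.

```latex
% Assumed standard setup: width-$m$ two-layer network, $n$ samples,
% squared empirical risk, and gradient flow on that risk.
\[
f(x;\theta) = \sum_{k=1}^{m} a_k\,\sigma\!\left(w_k^{\top} x\right),
\qquad
R_S(\theta) = \frac{1}{2n}\sum_{i=1}^{n}\bigl(f(x_i;\theta)-y_i\bigr)^2,
\qquad
\frac{\mathrm{d}\theta(t)}{\mathrm{d}t} = -\nabla_{\theta} R_S\bigl(\theta(t)\bigr).
\]
```

In this notation, the global minima in question are parameters θ with R_S(θ) = 0, and the results concern how those with zero generalization error separate from the rest as n grows and how fast the gradient flow θ(t) reaches them locally.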