🤖 AI Summary
This work investigates the implicit regularization of variational learning (VL) in deep neural network training, specifically whether VL converges to flatter, better-generalizing minima than standard gradient descent under “Edge of Stability” (EoS) dynamics. We introduce the EoS framework into VL theory for the first time and show that VL, by controlling the posterior covariance and the number of Monte Carlo samples drawn from the posterior, can steer and stabilize the optimization trajectory toward flatter EoS regimes. Following the standard EoS literature, the analysis first treats a quadratic model and then extends to deep networks; the resulting predictions are validated empirically on ResNet-18 and ViT-S, where VL solutions exhibit markedly higher flatness and lower test error. Our core contribution is uncovering VL’s intrinsic EoS-aware preference for flat minima, a new theoretical lens on its generalization advantage.
📝 Abstract
Variational Learning (VL) has recently gained popularity for training deep neural networks and is competitive with standard learning methods. Part of its empirical success can be explained by theories such as PAC-Bayes bounds, minimum description length, and marginal likelihood, but there are few tools to unravel the implicit regularization at play. Here, we analyze the implicit regularization of VL through the Edge of Stability (EoS) framework. EoS has previously been used to show that gradient descent can find flat solutions, and we extend this result to VL to show that it can find even flatter ones. This is achieved by controlling the posterior covariance and the number of Monte Carlo samples drawn from the posterior. These results are derived in the same fashion as the standard EoS literature for deep learning: we first establish the result for a quadratic problem and then extend it to deep neural networks. We empirically validate these findings on a wide variety of large networks, such as ResNet and ViT, and find that the theoretical predictions closely match the empirical results. Ours is the first work to analyze the EoS dynamics in VL.
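To build intuition for the quadratic starting point mentioned in the abstract, the following toy sketch (not the paper's actual derivation) shows the classical Edge of Stability fact for gradient descent on f(x) = ½λx²: the iterates contract iff ηλ < 2, so 2/η is the sharpness threshold. The Monte Carlo variant, where the gradient is averaged over `K` posterior-like samples with standard deviation `sigma`, is purely illustrative of how the two knobs the paper highlights (posterior covariance, number of MC samples) enter the dynamics; the names `eta`, `lam`, `sigma`, and `K` are our own choices.

```python
import numpy as np

def gd_on_quadratic(lam, eta, x0=1.0, steps=50):
    """Gradient descent on f(x) = 0.5 * lam * x**2.

    The update x <- x - eta * lam * x contracts iff |1 - eta*lam| < 1,
    i.e. the sharpness lam must stay below 2/eta (the stability edge).
    """
    x = x0
    for _ in range(steps):
        x = x - eta * lam * x
    return x

def mc_gd_on_quadratic(lam, eta, sigma, K, x0=1.0, steps=50, seed=0):
    """Toy VL-style update: average the gradient over K samples x + eps,
    eps ~ N(0, sigma^2). For the quadratic, the estimator is unbiased and
    its variance shrinks like sigma^2 / K, so sigma and K jointly control
    the noise injected into the EoS dynamics (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(steps):
        grad = np.mean(lam * (x + rng.normal(0.0, sigma, size=K)))
        x = x - eta * grad
    return x

eta = 0.1
stable = abs(gd_on_quadratic(lam=15.0, eta=eta))    # eta*lam = 1.5 < 2: contracts
unstable = abs(gd_on_quadratic(lam=25.0, eta=eta))  # eta*lam = 2.5 > 2: diverges
noisy = abs(mc_gd_on_quadratic(lam=15.0, eta=eta, sigma=0.1, K=10))
```

Running this, the stable iterate shrinks geometrically (factor 0.5 per step), the unstable one grows geometrically (factor 1.5 per step), and the Monte Carlo iterate hovers near the minimum at a scale set by `sigma / sqrt(K)`.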