Monge SAM: Robust Reparameterization-Invariant Sharpness-Aware Minimization Based on Loss Geometry

📅 2025-02-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing Sharpness-Aware Minimization (SAM) methods rely on the Euclidean metric and therefore lack reparameterization invariance, which distorts the correlation between sharpness and generalization. To address this, the authors propose Monge SAM (M-SAM), a reparameterization-invariant sharpness-aware optimizer. It incorporates the intrinsic Riemannian metric naturally induced by the loss surface, enabling a geometrically consistent search for flat minima. The method embeds the differential-geometric structure of the loss landscape into the SAM framework, combining Riemannian geometry with adversarial perturbation-based gradient updates. Theoretically, its dynamics lie between standard gradient descent and conventional SAM, which reduces hyperparameter sensitivity and mitigates saddle-point trapping. Empirically, M-SAM improves generalization on a multimodal representation alignment task. It thus bridges theoretical rigor with practical robustness, offering a principled geometric foundation for sharpness-aware optimization.
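For reference, the standard SAM update that the summary builds on is a two-step rule: perturb the parameters in the normalized gradient direction, then descend using the gradient taken at the perturbed point. A minimal NumPy sketch, where the function names, step sizes, and toy loss are illustrative rather than from the paper:

```python
import numpy as np

def sam_step(theta, grad_fn, lr=0.1, rho=0.05):
    """One standard SAM step: ascend to an adversarial point, descend from there."""
    g = grad_fn(theta)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation of radius rho
    return theta - lr * grad_fn(theta + eps)     # descend using the perturbed gradient

# Toy usage on L(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
theta = sam_step(theta, grad_fn=lambda t: t)
```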

📝 Abstract
Recent studies on deep neural networks show that flat minima of the loss landscape correlate with improved generalization. Sharpness-aware minimization (SAM) efficiently finds flat regions by updating the parameters according to the gradient at an adversarial perturbation. The perturbation depends on the Euclidean metric, making SAM non-invariant under reparametrizations, which blurs the relationship between sharpness and generalization. We propose Monge SAM (M-SAM), a reparametrization-invariant version of SAM, obtained by considering a Riemannian metric on the parameter space that is naturally induced by the loss surface. Compared to previous approaches, M-SAM works under any modeling choice and relies only on mild assumptions, while being as computationally efficient as SAM. We theoretically argue that M-SAM varies between SAM and gradient descent (GD), which increases robustness to hyperparameter selection and reduces attraction to suboptimal equilibria such as saddle points. We demonstrate this behavior both theoretically and empirically on a multimodal representation alignment task.
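To make the geometric idea concrete, the sketch below shows one plausible way the perturbation changes under a metric induced by the loss graph. It assumes the induced metric is the graph ("Monge") metric G = I + ∇L ∇Lᵀ and that the perturbation radius ρ is measured in that metric; the paper's exact normalization may differ, so treat this as an illustration rather than the authors' algorithm:

```python
import numpy as np

def monge_sam_step(theta, grad_fn, lr=0.1, rho=0.05):
    """One M-SAM-style step (illustrative sketch, not the authors' code).

    Assumes the loss-induced metric is G = I + g g^T and the perturbation
    of radius rho is measured in that metric. Sherman-Morrison gives
    G^{-1} g = g / (1 + ||g||^2), so the effective Euclidean radius is
    rho / sqrt(1 + ||g||^2): close to SAM where the gradient is small,
    close to plain GD where it is large.
    """
    g = grad_fn(theta)
    sq = float(g @ g)  # squared gradient norm ||g||^2
    eps = rho * g / (np.sqrt(sq) * np.sqrt(1.0 + sq) + 1e-12)
    return theta - lr * grad_fn(theta + eps)
```

Note how the gradient-norm-dependent scaling realizes the claimed interpolation: in flat, small-gradient regions the step behaves like SAM, while in steep regions it collapses toward plain gradient descent, which is what the abstract credits with reducing sensitivity to the hyperparameter ρ.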
Problem

Research questions and friction points this paper is trying to address.

Reparameterization-invariant sharpness-aware minimization
Robustness to hyperparameter selection
Reduced attraction to suboptimal equilibria
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reparameterization-invariant SAM
Riemannian metric in parameter space
Robust hyperparameter selection
Albert Kjøller Jacobsen
Section for Cognitive Systems, DTU Compute, Technical University of Denmark, Kongens Lyngby, Denmark
Georgios Arvanitidis
Cognitive Systems, DTU Compute, Technical University of Denmark
Machine Learning · Geometry