Monge SAM: Robust Reparameterization-Invariant Sharpness-Aware Minimization Based on Loss Geometry

📅 2025-02-12
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Existing Sharpness-Aware Minimization (SAM) methods rely on the Euclidean metric and therefore lack reparameterization invariance, which distorts the correlation between sharpness and generalization. To address this, the authors propose Monge SAM (M-SAM), a reparameterization-invariant sharpness-aware optimizer. It incorporates the intrinsic Riemannian metric naturally induced by the loss surface, enabling a geometrically consistent search for flat minima. The method embeds the differential-geometric structure of the loss landscape into the SAM framework, combining Riemannian geometry with adversarial perturbation-based gradient updates. Theoretically, its dynamics lie between standard gradient descent and conventional SAM, which reduces hyperparameter sensitivity and mitigates saddle-point trapping. Empirically, M-SAM improves generalization on a multimodal representation alignment task. It thus bridges theoretical rigor with practical robustness, offering a principled geometric foundation for sharpness-aware optimization.
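For reference, the standard SAM update that the summary builds on is a two-step rule: perturb the parameters in the normalized gradient direction, then descend using the gradient taken at the perturbed point. A minimal NumPy sketch, where the function names, step sizes, and toy loss are illustrative rather than from the paper:

```python
import numpy as np

def sam_step(theta, grad_fn, lr=0.1, rho=0.05):
    """One standard SAM step: ascend to an adversarial point, descend from there."""
    g = grad_fn(theta)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # worst-case perturbation of radius rho
    return theta - lr * grad_fn(theta + eps)     # descend using the perturbed gradient

# Toy usage on L(theta) = 0.5 * ||theta||^2, whose gradient is theta itself.
theta = np.array([1.0, -2.0])
theta = sam_step(theta, grad_fn=lambda t: t)
```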

📝 Abstract
Recent studies on deep neural networks show that flat minima of the loss landscape correlate with improved generalization. Sharpness-aware minimization (SAM) efficiently finds flat regions by updating the parameters according to the gradient at an adversarial perturbation. The perturbation depends on the Euclidean metric, making SAM non-invariant under reparametrizations, which blurs the relationship between sharpness and generalization. We propose Monge SAM (M-SAM), a reparametrization-invariant version of SAM, obtained by considering a Riemannian metric on the parameter space that is naturally induced by the loss surface. Compared to previous approaches, M-SAM works under any modeling choice and relies only on mild assumptions, while being as computationally efficient as SAM. We theoretically argue that M-SAM varies between SAM and gradient descent (GD), which increases robustness to hyperparameter selection and reduces attraction to suboptimal equilibria such as saddle points. We demonstrate this behavior both theoretically and empirically on a multimodal representation alignment task.
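To make the geometric idea concrete, the sketch below shows one plausible way the perturbation changes under a metric induced by the loss graph. It assumes the induced metric is the graph ("Monge") metric G = I + ∇L ∇Lᵀ and that the perturbation radius ρ is measured in that metric; the paper's exact normalization may differ, so treat this as an illustration rather than the authors' algorithm:

```python
import numpy as np

def monge_sam_step(theta, grad_fn, lr=0.1, rho=0.05):
    """One M-SAM-style step (illustrative sketch, not the authors' code).

    Assumes the loss-induced metric is G = I + g g^T and the perturbation
    of radius rho is measured in that metric. Sherman-Morrison gives
    G^{-1} g = g / (1 + ||g||^2), so the effective Euclidean radius is
    rho / sqrt(1 + ||g||^2): close to SAM where the gradient is small,
    close to plain GD where it is large.
    """
    g = grad_fn(theta)
    sq = float(g @ g)  # squared gradient norm ||g||^2
    eps = rho * g / (np.sqrt(sq) * np.sqrt(1.0 + sq) + 1e-12)
    return theta - lr * grad_fn(theta + eps)
```

Note how the gradient-norm-dependent scaling realizes the claimed interpolation: in flat, small-gradient regions the step behaves like SAM, while in steep regions it collapses toward plain gradient descent, which is what the abstract credits with reducing sensitivity to the hyperparameter ρ.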
Problem

Research questions and friction points this paper is trying to address.

Reparameterization-invariant sharpness-aware minimization
Robustness to hyperparameter selection
Reduced attraction to suboptimal equilibria
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reparameterization-invariant SAM
Riemannian metric in parameter space
Robust hyperparameter selection
Albert Kjøller Jacobsen
Section for Cognitive Systems, DTU Compute, Technical University of Denmark, Kongens Lyngby, Denmark
Georgios Arvanitidis
Cognitive Systems, DTU Compute, Technical University of Denmark
Machine Learning · Geometry