X-SAM: Boosting Sharpness-Aware Minimization with Dominant-Eigenvector Gradient Correction

📅 2026-01-15
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the limitation of Sharpness-Aware Minimization (SAM), which can still be influenced by sharp minima during optimization, thereby compromising generalization. To overcome this, the authors propose X-SAM, a novel approach grounded in spectral and geometric perspectives. X-SAM quantifies sharpness via the angle between the dominant eigenvector of the Hessian and the gradient, then explicitly aligns optimization directions by orthogonally decomposing the gradient along this eigenvector, enabling direct regularization of the largest eigenvalue. Theoretical analysis establishes the convergence and generalization benefits of X-SAM. Extensive experiments demonstrate that X-SAM consistently outperforms SAM across multiple tasks, achieving superior generalization performance and enhanced optimization stability.

📝 Abstract
Sharpness-Aware Minimization (SAM) aims to improve generalization by minimizing a worst-case perturbed loss over a small neighborhood of model parameters. However, during training, its optimization behavior does not always align with theoretical expectations, since both sharp and flat regions may yield a small perturbed loss. In such cases, the gradient may still point toward sharp regions, failing to achieve the intended effect of SAM. To address this issue, we investigate SAM from a spectral and geometric perspective: specifically, we use the angle between the gradient and the leading eigenvector of the Hessian as a measure of sharpness. Our analysis shows that when this angle is less than or equal to ninety degrees, the effect of SAM's sharpness regularization can be weakened. We therefore propose an explicit eigenvector-aligned SAM (X-SAM), which corrects the gradient via orthogonal decomposition along the top eigenvector, enabling more direct and efficient regularization of the Hessian's maximum eigenvalue. We prove X-SAM's convergence and superior generalization, and extensive experimental evaluations confirm both its theoretical and practical advantages.
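The two geometric quantities the abstract relies on, the angle between the gradient and the Hessian's dominant eigenvector, and the orthogonal decomposition of the gradient along that eigenvector, can be illustrated on a toy quadratic. This is only a minimal NumPy sketch of those two computations (using power iteration to approximate the top eigenvector), not the paper's X-SAM algorithm; all function names and the example Hessian are illustrative.

```python
import numpy as np

def top_eigenvector(H, iters=100):
    # Power iteration: repeatedly apply H and renormalize to approximate
    # the eigenvector of the largest-magnitude eigenvalue.
    v = np.random.default_rng(0).normal(size=H.shape[0])
    for _ in range(iters):
        v = H @ v
        v /= np.linalg.norm(v)
    return v

def decompose_gradient(g, v):
    # Split g into its component along the unit vector v
    # and the remainder orthogonal to v.
    g_par = (g @ v) * v
    return g_par, g - g_par

# Toy quadratic loss: diagonal Hessian with one dominant curvature direction.
H = np.diag([10.0, 1.0, 0.1])
g = np.array([1.0, 1.0, 1.0])

v = top_eigenvector(H)
# |cos| of the angle between the gradient and the top eigenvector,
# used in the paper as a sharpness measure (sign of v is arbitrary).
cos_theta = abs(g @ v) / np.linalg.norm(g)
g_par, g_perp = decompose_gradient(g, v)
```

In a real network, the Hessian is never formed explicitly; the same power iteration would use Hessian-vector products from automatic differentiation.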
Problem

Research questions and friction points this paper is trying to address.

Sharpness-Aware Minimization
generalization
Hessian eigenvector
loss sharpness
optimization behavior
Innovation

Methods, ideas, or system contributions that make the work stand out.

Sharpness-Aware Minimization
Hessian eigenvector
gradient correction
flat minima
X-SAM
Hongru Duan
Taiyuan University of Technology
Yongle Chen
Taiyuan University of Technology
Lei Guan
Member of Technical Staff, Nokia Bell Labs
Digital Predistortion · Wireless Communications · FPGA · MATLAB/Simulink · High Performance Computing