🤖 AI Summary
This work addresses a limitation of Sharpness-Aware Minimization (SAM): during optimization it can still be drawn toward sharp minima, compromising generalization. To overcome this, the authors propose X-SAM, a novel approach grounded in spectral and geometric perspectives. X-SAM quantifies sharpness via the angle between the gradient and the dominant eigenvector of the Hessian, then explicitly aligns the optimization direction by orthogonally decomposing the gradient along this eigenvector, enabling direct regularization of the Hessian's largest eigenvalue. Theoretical analysis establishes the convergence and generalization benefits of X-SAM, and extensive experiments demonstrate that X-SAM consistently outperforms SAM across multiple tasks, achieving superior generalization and improved optimization stability.
📝 Abstract
Sharpness-Aware Minimization (SAM) aims to improve generalization by minimizing a worst-case perturbed loss over a small neighborhood of the model parameters. However, during training its optimization behavior does not always match theoretical expectations, since both sharp and flat regions may yield a small perturbed loss. In such cases, the gradient may still point toward sharp regions, failing to achieve SAM's intended effect. To address this issue, we investigate SAM from a spectral and geometric perspective: specifically, we use the angle between the gradient and the leading eigenvector of the Hessian as a measure of sharpness. Our analysis shows that when this angle is at most ninety degrees, the effect of SAM's sharpness regularization can be weakened. We therefore propose explicit eigenvector-aligned SAM (X-SAM), which corrects the gradient via orthogonal decomposition along the top eigenvector, enabling more direct and efficient regularization of the Hessian's maximum eigenvalue. We prove X-SAM's convergence and generalization advantages, and extensive experiments confirm its theoretical and practical benefits.
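The geometric quantities the abstract relies on — the top Hessian eigenvector, the angle it makes with the gradient, and the orthogonal decomposition of the gradient along it — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy quadratic loss, the matrix `H`, the point `w`, and the power-iteration count are all illustrative assumptions; in practice the top eigenvector would be estimated with Hessian-vector products rather than an explicit Hessian.

```python
import numpy as np

# Toy quadratic loss L(w) = 0.5 * w^T H w, so grad L(w) = H w and the
# Hessian is the constant matrix H. Values are illustrative only.
H = np.array([[4.0, 1.0],
              [1.0, 2.0]])
w = np.array([1.0, 0.5])
grad = H @ w

# Power iteration to approximate the Hessian's leading eigenvector v1
# (in deep learning this step would use Hessian-vector products).
v = np.ones(2) / np.sqrt(2.0)
for _ in range(100):
    v = H @ v
    v /= np.linalg.norm(v)

# Sharpness measure: angle between the gradient and v1. The sign of an
# eigenvector is arbitrary, so |cos| gives an angle in [0, 90] degrees.
cos_theta = (grad @ v) / np.linalg.norm(grad)
angle_deg = np.degrees(np.arccos(np.clip(abs(cos_theta), -1.0, 1.0)))

# Orthogonal decomposition of the gradient along v1:
# grad = g_parallel + g_orth, where g_parallel is the component that
# an X-SAM-style correction could rescale to target lambda_max directly.
g_parallel = (grad @ v) * v
g_orth = grad - g_parallel
```

Here `g_parallel` captures how strongly the update moves along the sharpest curvature direction, while `g_orth` is the curvature-neutral remainder; the abstract's correction amounts to treating these two components differently rather than following the raw gradient.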