🤖 AI Summary
This work identifies a key mechanism behind Sharpness-Aware Minimization's (SAM) generalization improvement: alignment between the perturbation direction and the top eigenvector of the Hessian. In practice, however, this alignment is often insufficient. To address this, the authors propose Eigen-SAM, a SAM variant that explicitly regularizes the largest Hessian eigenvalue by steering gradient perturbations toward the principal Hessian eigenvector, thereby strengthening sharpness control. The analysis models SAM's training dynamics with a third-order stochastic differential equation, and the algorithm combines Hessian spectral estimation with eigenvector-guided perturbation. Experiments across vision and language benchmarks show that Eigen-SAM consistently outperforms SAM in generalization and convergence stability, and the theoretical insights are validated empirically. Code is publicly available.
📝 Abstract
Sharpness-Aware Minimization (SAM) has attracted significant attention for its effectiveness in improving generalization across various tasks. However, its underlying principles remain poorly understood. In this work, we analyze SAM's training dynamics using the maximum eigenvalue of the Hessian as a measure of sharpness, and derive a third-order stochastic differential equation (SDE) which reveals that these dynamics are driven by a complex mixture of second- and third-order terms. We show that alignment between the perturbation vector and the top eigenvector is crucial for SAM's effectiveness in regularizing sharpness, but find that this alignment is often inadequate in practice, limiting SAM's efficiency. Building on these insights, we introduce Eigen-SAM, an algorithm that explicitly regularizes the top Hessian eigenvalue by aligning the perturbation vector with the leading eigenvector. We validate our theory and the practical advantages of the proposed approach through comprehensive experiments. Code is available at https://github.com/RitianLuo/EigenSAM.
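The core mechanism described above (estimating the top Hessian eigenvector and steering the SAM perturbation toward it) can be sketched on a toy problem. This is a minimal illustration, not the authors' implementation: it uses a quadratic loss with a known Hessian, estimates the leading eigenpair via power iteration on Hessian-vector products, and mixes the eigenvector into the perturbation with a hypothetical coefficient `alpha`.

```python
import numpy as np

# Toy quadratic loss L(w) = 0.5 * w^T A w, so the Hessian is A everywhere.
A = np.diag([5.0, 2.0, 1.0])

def grad(w):
    return A @ w

def hvp(w, v, eps=1e-4):
    # Hessian-vector product via central finite differences of the gradient;
    # in deep learning this would use automatic differentiation instead.
    return (grad(w + eps * v) - grad(w - eps * v)) / (2 * eps)

def top_eigenpair(w, iters=50, seed=0):
    # Power iteration: repeatedly apply the Hessian and renormalize,
    # converging to the eigenvector of the largest-magnitude eigenvalue.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = hvp(w, v)
        v /= np.linalg.norm(v)
    lam = v @ hvp(w, v)  # Rayleigh quotient estimate of the eigenvalue
    return lam, v

def eigen_sam_perturbation(w, rho=0.05, alpha=0.5):
    # Sketch of an eigenvector-aligned perturbation: blend the normalized
    # gradient with the top eigenvector (sign-matched to the gradient),
    # then rescale to the SAM radius rho. `alpha` is a hypothetical
    # mixing coefficient, not a parameter from the paper.
    g = grad(w)
    _, v = top_eigenpair(w)
    v = np.sign(v @ g) * v  # orient v to ascend along the gradient
    d = (1 - alpha) * g / np.linalg.norm(g) + alpha * v
    return rho * d / np.linalg.norm(d)

w = np.array([1.0, 1.0, 1.0])
lam, v = top_eigenpair(w)
print(round(lam, 3))   # close to 5.0, the top eigenvalue of A
eps = eigen_sam_perturbation(w)
print(round(np.linalg.norm(eps), 3))  # perturbation norm equals rho
```

Because the perturbation keeps a component along the top eigenvector, the subsequent gradient step at `w + eps` carries curvature information in the sharpest direction, which is the alignment the paper identifies as crucial.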