Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization

📅 2025-01-22
🤖 AI Summary
This work identifies a key mechanism behind the generalization gains of Sharpness-Aware Minimization (SAM): alignment between the perturbation vector and the top eigenvector of the Hessian. In practice this alignment is often inadequate, which limits how efficiently SAM regularizes sharpness. To address this, the authors propose Eigen-SAM, a SAM variant that explicitly regularizes the largest Hessian eigenvalue by aligning the perturbation vector with the leading eigenvector. The analysis rests on a third-order stochastic differential equation (SDE) for SAM's training dynamics, which shows that the dynamics are driven by a mixture of second- and third-order terms. Experiments support both the theory and the practical advantages of Eigen-SAM over SAM. Code is publicly available.

📝 Abstract
Sharpness-Aware Minimization (SAM) has attracted significant attention for its effectiveness in improving generalization across various tasks. However, its underlying principles remain poorly understood. In this work, we analyze SAM's training dynamics using the maximum eigenvalue of the Hessian as a measure of sharpness, and propose a third-order stochastic differential equation (SDE), which reveals that the dynamics are driven by a complex mixture of second- and third-order terms. We show that alignment between the perturbation vector and the top eigenvector is crucial for SAM's effectiveness in regularizing sharpness, but find that this alignment is often inadequate in practice, limiting SAM's efficiency. Building on these insights, we introduce Eigen-SAM, an algorithm that explicitly aims to regularize the top Hessian eigenvalue by aligning the perturbation vector with the leading eigenvector. We validate the effectiveness of our theory and the practical advantages of our proposed approach through comprehensive experiments. Code is available at https://github.com/RitianLuo/EigenSAM.
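The idea of aligning the perturbation vector with the leading Hessian eigenvector can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a toy quadratic loss with a known positive-definite Hessian, estimates the top eigenpair by power iteration on finite-difference Hessian-vector products, and uses a hypothetical alignment rule that boosts the gradient component along the estimated eigenvector by a factor `1 + alpha`. The names `hvp`, `top_eigenpair`, `aligned_perturbation`, `rho`, and `alpha` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Toy quadratic loss L(w) = 0.5 * w^T A w, so the Hessian equals A everywhere.
# (Assumption for illustration; a real model would use autodiff gradients.)
A = np.diag([10.0, 1.0, 0.1])

def grad_fn(w):
    """Gradient of the toy loss."""
    return A @ w

def hvp(grad_fn, w, v, eps=1e-4):
    """Central finite-difference Hessian-vector product H(w) v."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)

def top_eigenpair(grad_fn, w, iters=100, seed=0):
    """Power iteration on the Hessian at w.

    Converges to the eigenvalue of largest magnitude, which is the
    maximum eigenvalue when the Hessian is positive definite.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        hv = hvp(grad_fn, w, v)
        v = hv / (np.linalg.norm(hv) + 1e-12)
    lam = float(v @ hvp(grad_fn, w, v))  # Rayleigh quotient
    return lam, v

def sam_perturbation(g, rho):
    """Standard SAM perturbation: normalized gradient scaled to radius rho."""
    return rho * g / (np.linalg.norm(g) + 1e-12)

def aligned_perturbation(g, v, rho, alpha):
    """Hypothetical alignment rule: amplify the component of g along v.

    (g @ v) * v is the projection of g onto v, so the result does not
    depend on the sign of the estimated eigenvector.
    """
    d = g + alpha * (g @ v) * v
    return rho * d / (np.linalg.norm(d) + 1e-12)

def cosine_to(u, v):
    """Absolute cosine similarity between u and v."""
    return abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)

w = np.array([0.1, 1.0, 1.0])
g = grad_fn(w)
lam, v = top_eigenpair(grad_fn, w)

eps_sam = sam_perturbation(g, rho=0.05)
eps_aligned = aligned_perturbation(g, v, rho=0.05, alpha=1.0)
cos_sam = cosine_to(eps_sam, v)
cos_aligned = cosine_to(eps_aligned, v)

# SAM-style update: descend along the gradient evaluated at the perturbed point.
lr = 0.01
w_next = w - lr * grad_fn(w + eps_aligned)

print(f"estimated top eigenvalue: {lam:.4f}")
print(f"alignment with top eigenvector: SAM {cos_sam:.3f}, aligned {cos_aligned:.3f}")
```

In this toy setting the plain SAM perturbation is only partially aligned with the top eigenvector, while the boosted perturbation is more closely aligned at the same radius `rho`. For a neural network, the finite-difference Hessian-vector product would typically be replaced by an autodiff-based one, and the eigenvector estimate would be refreshed only occasionally to limit cost.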
Problem

Research questions and friction points this paper is trying to address.

Sharpness-Aware Minimization
Machine Learning Optimization
Perturbation Vector Misalignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Eigen-SAM
Sharpness-Aware Minimization
Mathematical Eigenvalue Adjustment