🤖 AI Summary
Sharpness-Aware Minimization (SAM) risks converging to “hallucinated minimizers”—points that are not true minimizers of the original objective yet satisfy SAM’s sharpness-aware stationarity conditions, thereby degrading generalization. Method: We provide the first rigorous optimization-theoretic proof that SAM can locally converge to such spurious minimizers and derive sufficient conditions for their emergence. Building on this analysis, we propose simple, computationally free modifications—e.g., gradient correction or neighborhood constraints—to provably avoid hallucinated minimizers. Contribution/Results: Both theoretical analysis and large-scale experiments (CIFAR-10/100, ImageNet) confirm that standard SAM indeed converges to hallucinated minimizers, while our modifications significantly improve convergence reliability and generalization performance. Our work establishes the first theoretical foundation for SAM’s failure modes and delivers practical, zero-overhead remedies—advancing both the robustness and deployability of sharpness-aware optimization.
📝 Abstract
Sharpness-Aware Minimization (SAM) is a widely used method that steers training toward flatter minimizers, which typically generalize better. In this work, however, we show that SAM can converge to hallucinated minimizers -- points that are not minimizers of the original objective. We theoretically prove the existence of such hallucinated minimizers and establish conditions for local convergence to them. We further provide empirical evidence demonstrating that SAM can indeed converge to these points in practice. Finally, we propose a simple yet effective remedy for avoiding hallucinated minimizers.