FedVSSAM: Mitigating Flatness Incompatibility in Sharpness-Aware Federated Learning

📅 2026-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses "flatness incompatibility" in heterogeneous federated learning: the flat minima found by local sharpness-aware optimization do not align with the flat region preferred by the global objective, which limits the generalization of the global model. The study identifies and characterizes this phenomenon and proposes FedVSSAM, a method that builds a variance-suppressed adjusted direction and uses it to anchor both local perturbations and update directions to a more stable global direction. The same direction is used in local flatness search, local descent, and the global update, which keeps the three steps coherent. Theoretical analysis gives non-convex convergence guarantees and shows that the mean-square deviation between the adjusted direction and the global gradient is controlled. Experiments show that FedVSSAM outperforms existing baselines across diverse federated settings.
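For readers unfamiliar with sharpness-aware minimization, the standard formulation from the SAM literature is reproduced below as background. The symbols (loss L_i of client i, weights w, radius ρ, perturbation ε, client count N) are notation chosen here for illustration and do not appear in the summary; this is the generic SAM objective, not FedVSSAM's adjusted rule.

```latex
% Generic SAM objective on client i (background; notation assumed here)
\min_{w} \; \max_{\|\epsilon\|_2 \le \rho} \; L_i(w + \epsilon),
\qquad
\hat{\epsilon}_i(w) = \rho \, \frac{\nabla L_i(w)}{\|\nabla L_i(w)\|_2}

% Global federated objective over N clients
F(w) = \frac{1}{N} \sum_{i=1}^{N} L_i(w)
```

Under data heterogeneity the local gradients ∇L_i(w) can point away from ∇F(w), so each client's perturbation ε̂_i probes sharpness in a direction that need not matter for F. That mismatch is what the summary calls flatness incompatibility.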

📝 Abstract

Sharpness-aware minimization (SAM) is an effective method for improving the generalization of federated learning (FL) by steering local training toward flat minima. Under data heterogeneity, however, device-side SAM searches for locally flat basins that are incompatible with the flat region preferred by the global objective. We identify this structural failure mode as flatness incompatibility, which explains why improving local flatness alone may provide limited training and generalization improvement for the global model. We reveal that flatness incompatibility arises from data heterogeneity and the friendly adversary phenomenon, and is further amplified by local updates and partial device participation. To mitigate this issue, we propose Federated Learning with variance-suppressed sharpness-aware minimization (FedVSSAM), which constructs a variance-suppressed adjusted direction and uses it consistently in local flatness search, local descent, and global update. FedVSSAM anchors both perturbation and update directions to a more stable global direction, instead of correcting only an isolated local perturbation. We establish non-convex convergence guarantees of FedVSSAM and prove that the mean-square deviation between the adjusted direction and the global gradient is effectively controlled. Experiments demonstrate that FedVSSAM mitigates flatness incompatibility and outperforms the baselines across diverse FL settings.
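The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of using one anchored direction for both the perturbation and the descent step. The function name `local_step`, the mixing weight `beta`, the vector `anchor` (a stand-in for a server-provided estimate of the global direction), and the convex blend itself are all assumptions made for illustration; they are not the paper's variance-suppressed construction.

```python
import numpy as np

def local_step(w, grad_fn, anchor, rho=0.05, lr=0.1, beta=0.5):
    """One illustrative local step with an anchored SAM-style perturbation.

    ASSUMPTION: the direction is a convex blend of the local gradient and
    `anchor`. This is a hypothetical stand-in, not the FedVSSAM rule.
    """
    g = grad_fn(w)
    # Blend the local gradient with the global anchor (hypothetical).
    d = beta * g + (1.0 - beta) * anchor
    # SAM-style ascent: move a distance rho along the blended direction.
    eps = rho * d / (np.linalg.norm(d) + 1e-12)
    # Gradient at the perturbed point, blended with the anchor again.
    g_pert = grad_fn(w + eps)
    d_new = beta * g_pert + (1.0 - beta) * anchor
    # Descend along the same kind of direction used for the perturbation.
    return w - lr * d_new, d_new

# Toy client loss 0.5 * ||w - c||^2 whose gradient is w - c.
c = np.array([1.0, -2.0])
grad_fn = lambda w: w - c
w0 = np.zeros(2)
anchor = np.array([-0.5, 0.5])  # hypothetical global direction estimate
w1, d1 = local_step(w0, grad_fn, anchor)
```

With `beta=1.0` the sketch reduces to a plain local SAM step, and with `beta=0.0` the client follows the anchor alone; intermediate values trade local information against stability of the shared direction.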

Problem

Research questions and friction points this paper is trying to address.

flatness incompatibility

sharpness-aware minimization

federated learning

data heterogeneity

friendly adversary phenomenon

Innovation

Methods, ideas, or system contributions that make the work stand out.

flatness incompatibility

federated learning

sharpness-aware minimization