🤖 AI Summary
To address the degradation of generalization performance in federated learning under highly non-IID data—where state-of-the-art methods such as FedSAM even underperform FedAvg—this paper proposes FedMoSWA, a novel federated aggregation algorithm integrating Stochastic Weight Averaging (SWA), momentum-based averaging, and control variates. Its core innovation lies in momentum-guided local model weight averaging, which explicitly steers optimization toward flatter regions of the global loss landscape, thereby enhancing generalization in heterogeneous client environments. Theoretical analysis demonstrates that FedMoSWA achieves tighter bounds on both optimization error and generalization error compared to FedSAM and other baselines. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet confirm substantial improvements in test accuracy and robustness across diverse non-IID settings. The implementation is publicly available.
📝 Abstract
The generalization capability of federated learning (FL) algorithms such as FedSAM is crucial for real-world applications. In this paper, we revisit the generalization problem in FL and investigate the impact of data heterogeneity on FL generalization. We find that FedSAM usually performs worse than FedAvg when the data are highly heterogeneous, and thus propose a novel and effective federated learning algorithm with Stochastic Weight Averaging (called FedSWA), which aims to find flatter minima in the setting of highly heterogeneous data. Moreover, we introduce a new momentum-based stochastic controlled weight averaging FL algorithm (FedMoSWA), which is designed to better align local and global models.
Theoretically, we provide both convergence analysis and generalization bounds for FedSWA and FedMoSWA. We also prove that the optimization and generalization errors of FedMoSWA are smaller than those of its counterparts, including FedSAM and its variants. Empirically, experimental results on CIFAR10/100 and Tiny ImageNet demonstrate the superiority of the proposed algorithms compared to their counterparts. The source code is available at: https://github.com/junkangLiu0/FedSWA.
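As a rough illustration of the weight-averaging idea mentioned in the abstract, the following is a minimal sketch of generic stochastic weight averaging combined with FedAvg-style server aggregation on a toy quadratic loss. It is not the authors' FedSWA or FedMoSWA: the abstract does not specify their update rules, and the momentum and control-variate components are omitted entirely. The function names (`local_swa`, `server_average`, `run_rounds`), the toy loss, and all hyperparameters are assumptions made for illustration only.

```python
from typing import List

Vector = List[float]


def local_swa(w_global: Vector, target: Vector, lr: float, steps: int) -> Vector:
    """Run local gradient steps on the toy loss 0.5 * ||w - target||^2 and
    return the running average of the local iterates (generic SWA weights).

    `target` stands in for a client's data; differing targets across
    clients mimic heterogeneous (non-IID) data. This is an assumption of
    the sketch, not something taken from the paper.
    """
    w = list(w_global)
    w_avg = [0.0] * len(w)
    for k in range(1, steps + 1):
        # The gradient of 0.5 * ||w - target||^2 is (w - target).
        w = [wi - lr * (wi - ti) for wi, ti in zip(w, target)]
        # Incremental mean: avg_k = avg_{k-1} + (w_k - avg_{k-1}) / k.
        w_avg = [ai + (wi - ai) / k for ai, wi in zip(w_avg, w)]
    return w_avg


def server_average(client_weights: List[Vector]) -> Vector:
    """Uniform FedAvg-style average of the weights returned by the clients."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]


def run_rounds(targets: List[Vector], rounds: int = 50,
               lr: float = 0.1, steps: int = 5) -> Vector:
    """Alternate local SWA training and server averaging for `rounds` rounds."""
    w = [0.0] * len(targets[0])
    for _ in range(rounds):
        w = server_average([local_swa(w, t, lr, steps) for t in targets])
    return w


if __name__ == "__main__":
    # Two clients with different optima, a toy stand-in for non-IID data.
    print(run_rounds([[1.0, 0.0], [-1.0, 2.0]]))
```

On this toy problem the global model approaches the mean of the client targets, which is the minimizer of the average loss. The sketch says nothing about flatness or generalization, which are the paper's actual concerns.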