FedSWA: Improving Generalization in Federated Learning with Highly Heterogeneous Data via Momentum-Based Stochastic Controlled Weight Averaging

📅 2025-07-26
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address the degradation of generalization performance in federated learning under highly non-IID data, where state-of-the-art methods such as FedSAM can even underperform FedAvg, this paper proposes FedSWA, a federated learning algorithm based on Stochastic Weight Averaging (SWA) that seeks flatter minima of the global loss landscape, and FedMoSWA, a momentum-based stochastic controlled weight averaging variant that uses control variates to better align local and global models. Theoretical analysis shows that FedMoSWA achieves tighter bounds on both optimization error and generalization error than FedSAM and other baselines. Extensive experiments on CIFAR-10, CIFAR-100, and Tiny ImageNet confirm substantial improvements in test accuracy and robustness across diverse non-IID settings. The implementation is publicly available.

📝 Abstract
For federated learning (FL) algorithms such as FedSAM, generalization capability is crucial for real-world applications. In this paper, we revisit the generalization problem in FL and investigate the impact of data heterogeneity on FL generalization. We find that FedSAM usually performs worse than FedAvg in the case of highly heterogeneous data, and thus propose a novel and effective federated learning algorithm with Stochastic Weight Averaging (called FedSWA), which aims to find flatter minima in the setting of highly heterogeneous data. Moreover, we introduce a new momentum-based stochastic controlled weight averaging FL algorithm (FedMoSWA), which is designed to better align local and global models. Theoretically, we provide both convergence analysis and generalization bounds for FedSWA and FedMoSWA. We also prove that the optimization and generalization errors of FedMoSWA are smaller than those of its counterparts, including FedSAM and its variants. Empirically, experimental results on CIFAR-10/100 and Tiny ImageNet demonstrate the superiority of the proposed algorithms compared to their counterparts. Open source code at: https://github.com/junkangLiu0/FedSWA.
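The abstract's core idea, averaging global model iterates to land in flatter minima, can be illustrated with a minimal sketch. This is a hypothetical simplification, not the paper's actual FedSWA algorithm: models are plain lists of floats, clients are averaged uniformly, and details such as cyclical learning rates and sampling schedules are omitted.

```python
def fedavg_round(client_weights):
    """Plain FedAvg: coordinate-wise uniform average of client model weights."""
    n = len(client_weights)
    return [sum(ws) / n for ws in zip(*client_weights)]

def swa_update(swa_weights, new_global, n_averaged):
    """SWA-style running average over successive global models; averaging
    iterates along the optimization trajectory biases the final solution
    toward flatter regions of the loss landscape."""
    return [(s * n_averaged + w) / (n_averaged + 1)
            for s, w in zip(swa_weights, new_global)]

# Usage: three clients, each holding a 2-parameter model.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
g = fedavg_round(clients)            # -> [3.0, 4.0]
swa = swa_update([1.0, 2.0], g, 1)   # -> [2.0, 3.0]
```

The key difference from plain FedAvg is that the model ultimately deployed is the running average `swa`, not the last global iterate `g`.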
Problem

Research questions and friction points this paper is trying to address.

Improving generalization in federated learning with heterogeneous data
Addressing performance decline of FedSAM in highly heterogeneous data
Aligning local and global models via momentum-based weight averaging
Innovation

Methods, ideas, or system contributions that make the work stand out.

Momentum-based stochastic controlled weight averaging
Finds flatter minima for heterogeneous data
Better aligns local and global models
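The points above can be sketched in code. The exact update rules of FedMoSWA are not given in this summary, so the snippet below is an illustrative assumption: a SCAFFOLD-style control-variate correction for client drift, plus a generic server-side momentum term. All names and the precise form of each update are hypothetical.

```python
def corrected_local_step(w, grad, c_global, c_local, lr=0.1):
    """One drift-corrected local SGD step: the control-variate term
    (c_global - c_local) counteracts client drift under heterogeneous
    data, as in SCAFFOLD-style methods (illustrative form)."""
    return [wi - lr * (gi - cl + cg)
            for wi, gi, cl, cg in zip(w, grad, c_local, c_global)]

def momentum_average(server_momentum, avg_client_update, beta=0.9):
    """Hypothetical momentum over the averaged client update, smoothing
    the direction in which local models are steered toward the global one."""
    return [beta * m + (1 - beta) * u
            for m, u in zip(server_momentum, avg_client_update)]

# Usage: identical control variates cancel, recovering plain SGD.
step = corrected_local_step([1.0], [1.0], [0.5], [0.5])  # -> [0.9]
mom = momentum_average([0.0], [1.0])                     # ~  [0.1]
```

When a client's local gradient direction `c_local` disagrees with the global direction `c_global`, the correction pulls the local update back toward the global trajectory, which is one plausible mechanism for the local/global alignment claimed above.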
Junkang Liu
College of Intelligence and Computing, Tianjin University, Tianjin, China
Yuanyuan Liu
School of Artificial Intelligence, Xidian University, Xi'an, China
Fanhua Shang
Professor at Tianjin University
Machine Learning · Data Mining · Computer Vision
Hongying Liu
Tianjin University
Machine Learning · Image Processing
Jin Liu
School of Cyber Engineering, Xidian University, Xi'an, China
Wei Feng
College of Intelligence and Computing, Tianjin University, Tianjin, China