FedWCM: Unleashing the Potential of Momentum-based Federated Learning in Long-Tailed Scenarios

📅 2025-07-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Federated learning (FL) suffers from convergence failures and model bias under non-IID, long-tailed data because momentum accumulates drifted gradient directions. To address this, the authors propose FedWCM, an FL method that combines layer-wise neural network analysis with dynamic momentum calibration. FedWCM adaptively adjusts momentum using global and per-round data, explicitly correcting the directional bias that long-tailed distributions introduce into aggregated gradients. It requires no additional labels or access to global data, making it compatible with client heterogeneity and class imbalance. Evaluated on long-tailed non-IID benchmarks, including CIFAR-10-LT and ImageNet-LT, FedWCM converges significantly faster and reaches higher final accuracy than state-of-the-art FL methods (e.g., FedAvg, FedProx, SCAFFOLD), resolving the non-convergence and bias issues that plague momentum-based FL in long-tailed settings.

📝 Abstract
Federated Learning (FL) enables decentralized model training while preserving data privacy. Despite its benefits, FL faces challenges with non-identically distributed (non-IID) data, especially in long-tailed scenarios with imbalanced class samples. Momentum-based FL methods, often used to accelerate FL convergence, struggle with these distributions, producing biased models and hindering convergence. To understand this challenge, we conduct extensive investigations into this phenomenon, accompanied by a layer-wise analysis of neural network behavior. Based on these insights, we propose FedWCM, a method that dynamically adjusts momentum using global and per-round data to correct directional biases introduced by long-tailed distributions. Extensive experiments show that FedWCM resolves non-convergence issues and outperforms existing methods, enhancing FL's efficiency and effectiveness in handling client heterogeneity and data imbalance.
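The abstract's core idea, damping server-side momentum when it drifts away from the clients' aggregated update, can be sketched as a momentum aggregation loop. The cosine-similarity rule below is an illustrative assumption for how the momentum coefficient could be adapted per round; it is not the paper's actual calibration rule, and the function names are hypothetical:

```python
import numpy as np

def weighted_aggregate(client_updates, client_sizes):
    """Average client model deltas, weighted by local dataset size."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, client_updates))

def adaptive_momentum(prev_velocity, avg_update, base_beta=0.9):
    """Illustrative rule: shrink the momentum coefficient when the new
    aggregated update disagrees in direction with the accumulated
    velocity, damping momentum-induced drift."""
    denom = np.linalg.norm(prev_velocity) * np.linalg.norm(avg_update)
    if denom == 0.0:
        return base_beta
    cos_sim = float(np.dot(prev_velocity, avg_update) / denom)
    # Map cosine similarity in [-1, 1] onto [0, base_beta].
    return base_beta * max(0.0, (1.0 + cos_sim) / 2.0)

def server_round(global_model, velocity, client_updates, client_sizes, lr=1.0):
    """One round of server-side momentum aggregation."""
    avg_update = weighted_aggregate(client_updates, client_sizes)
    beta = adaptive_momentum(velocity, avg_update)
    velocity = beta * velocity + avg_update
    return global_model + lr * velocity, velocity
```

With a zero initial velocity the first round reduces to plain weighted averaging; the coefficient only starts to matter once accumulated momentum and fresh updates can disagree.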
Problem

Research questions and friction points this paper is trying to address.

Addresses non-IID data challenges in Federated Learning
Improves momentum-based FL in long-tailed class imbalance
Corrects directional biases for better FL convergence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic momentum adjustment for FL bias correction
Layer-wise neural network behavior analysis
Global and per-round data utilization
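One plausible reading of "global and per-round data utilization" is that each round's participating clients report label histograms, from which the server estimates the round's class distribution and derives a momentum weight. The entropy-based rule below is a hypothetical sketch for illustration, not the weighting used in FedWCM:

```python
import numpy as np

def round_class_distribution(client_label_counts):
    """Aggregate per-client label histograms (assumed to be reported
    alongside updates) into this round's overall class distribution."""
    totals = np.sum(client_label_counts, axis=0).astype(float)
    return totals / totals.sum()

def momentum_weight_from_imbalance(class_probs, max_beta=0.9):
    """Hypothetical rule: scale momentum by the normalized entropy of
    the round's class distribution, so heavily imbalanced (long-tailed)
    rounds rely less on accumulated momentum."""
    p = class_probs[class_probs > 0]
    entropy = -np.sum(p * np.log(p))
    max_entropy = np.log(len(class_probs))
    return max_beta * float(entropy / max_entropy)
```

A perfectly balanced round yields the full coefficient `max_beta`, while a round dominated by head classes yields a smaller one, which matches the intuition that momentum built from skewed rounds should carry less weight.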