🤖 AI Summary
Federated learning (FL) suffers from slow convergence and degraded accuracy under statistical heterogeneity and partial client participation, and existing momentum-based methods offer limited relief because their momentum estimates are biased toward the most recently sampled clients. This paper introduces Generalized Heavy-Ball Momentum (GHBM), a principled extension of heavy-ball momentum designed to counteract the effects of statistical heterogeneity in FL, and derives FedHBM, an adaptive instance of GHBM that is communication-efficient by design and requires no additional compression machinery. Extensive experiments on vision and language tasks, in both controlled and realistic large-scale scenarios, show substantial and consistent gains over state-of-the-art methods.
📝 Abstract
Federated Learning (FL) has emerged as the state-of-the-art approach for learning from decentralized data in privacy-constrained scenarios. However, system and statistical challenges hinder real-world applications, which demand efficient learning from edge devices and robustness to heterogeneity. Despite significant research efforts, existing approaches (i) are not sufficiently robust, (ii) do not perform well in large-scale scenarios, and (iii) are not communication-efficient. In this work, we propose a novel Generalized Heavy-Ball Momentum (GHBM), motivating its principled application to counteract the effects of statistical heterogeneity in FL. Then, we present FedHBM, an adaptive instance of GHBM that is communication-efficient by design. Extensive experimentation on vision and language tasks, in both controlled and realistic large-scale scenarios, provides compelling evidence of substantial and consistent performance gains over the state of the art.
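As background for the "generalized heavy-ball" idea named in the title, the sketch below shows classical (Polyak) heavy-ball momentum, which GHBM generalizes. This is a minimal textbook illustration on a toy quadratic, not the paper's GHBM update rule or its federated formulation; the function names and hyperparameters are illustrative choices.

```python
import numpy as np

def heavy_ball(grad, x0, lr=0.1, beta=0.9, steps=200):
    """Classical heavy-ball (Polyak) momentum:
    x_{t+1} = x_t - lr * grad(x_t) + beta * (x_t - x_{t-1}).
    Shown only as the non-federated baseline that GHBM generalizes."""
    x, x_prev = x0.copy(), x0.copy()
    for _ in range(steps):
        # The beta * (x - x_prev) term accumulates past update directions,
        # damping oscillations along ill-conditioned directions.
        x_next = x - lr * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Toy ill-conditioned quadratic f(x) = 0.5 * x^T A x, minimized at the origin.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x_star = heavy_ball(grad, np.array([5.0, 5.0]))
```

In the federated setting the difficulty, as the summary notes, is that a naive momentum buffer on the server is dominated by the clients sampled in recent rounds; GHBM/FedHBM is presented as a principled way to build the momentum term so that it remains useful under heterogeneity and partial participation.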