LoDAdaC: a unified local training-based decentralized framework with adaptive gradients and compressed communication

📅 2026-04-11

📈 Citations: 0

✨ Influential: 0

career value

192K/year

🤖 AI Summary

This work addresses the challenge in decentralized distributed learning of simultaneously achieving fast convergence and low communication overhead, particularly the lack of theoretical guarantees for integrating multi-round local training, adaptive gradient optimization, and communication compression. The paper proposes LoDAdaC, a novel framework that, for the first time, unifies Adam-type adaptive optimizers (e.g., Adam, AMSGrad, AdaGrad), general compression operators (e.g., low-bit quantization and sparsification), and multiple local update steps within a decentralized setting, accompanied by rigorous convergence analysis. Empirical results demonstrate that LoDAdaC substantially outperforms existing methods on both image classification and GPT-style language model training, delivering significant improvements in both convergence speed and communication efficiency.

Technology Category

Application Category

📝 Abstract

In the decentralized distributed learning, achieving fast convergence and low communication cost is essential for scalability and high efficiency. Adaptive gradient methods, such as Adam, have demonstrated strong practical performance in deep learning and centralized distributed settings. However, their convergence properties remain largely unexplored in decentralized settings involving multiple local training steps, such as federated learning. To address this limitation, we propose LoDAdaC, a unified multiple Local Training (MLT) Decentralized framework with Adam-type updates and Compressed communication (CC). LoDAdaC accommodates a broad class of optimizers for its local adaptive updates, including AMSGrad, Adam, and AdaGrad; it is compatible with standard (possibly biased) compressors such as low-bit quantization and sparsification. MLT and CC enable LoDAdaC to achieve multiplied reduction of communication cost, while the technique of adaptive updates enables fast convergence. We rigorously prove the combined advantage through complexity analysis. In addition, experiments on image classification and GPT-style language model training validate our theoretical findings and show that LoDAdaC significantly outperforms existing decentralized algorithms in terms of convergence speed and communication efficiency.

Problem

Research questions and friction points this paper is trying to address.

decentralized learning

adaptive gradients

compressed communication

local training

convergence

Innovation

Methods, ideas, or system contributions that make the work stand out.

decentralized learning

adaptive gradients

compressed communication