A Bias-Correction Decentralized Stochastic Gradient Algorithm with Momentum Acceleration

📅 2025-01-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the convergence bias and reduced efficiency that data heterogeneity and network sparsity cause in distributed learning, this paper proposes EDM, a momentum-accelerated decentralized stochastic gradient algorithm. EDM integrates momentum into the Exact-Diffusion framework, the first such incorporation, enabling a provable correction of heterogeneity-induced bias without global synchronization or a central coordinator. Under non-convex and Polyak–Łojasiewicz (PL) assumptions, EDM achieves sublinear and linear convergence rates, respectively, both independent of the data distribution. Experiments demonstrate that EDM significantly outperforms state-of-the-art decentralized algorithms under heterogeneous data distributions and sparse network topologies, achieving faster convergence and lower steady-state error.

📝 Abstract
Distributed stochastic optimization algorithms can process large-scale data in parallel and accelerate model training. However, the sparsity of distributed networks and the heterogeneity of data limit these advantages. This paper proposes a momentum-accelerated distributed stochastic gradient algorithm, referred to as Exact-Diffusion with Momentum (EDM), which corrects the bias caused by data heterogeneity and introduces the momentum method commonly used in deep learning to accelerate convergence. We theoretically demonstrate that the algorithm converges sub-linearly to a neighborhood of the optimum, at a rate independent of data heterogeneity, for non-convex objective functions, and linearly under the Polyak–Łojasiewicz condition (a weaker assumption than $\mu$-strong convexity). Finally, we evaluate the performance of the proposed algorithm by simulation, comparing it with a range of existing decentralized optimization algorithms to demonstrate its effectiveness in addressing data heterogeneity and network sparsity.
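The abstract describes combining Exact-Diffusion's adapt/correct/combine steps with a momentum term. The paper's exact recursion is not reproduced here; below is a minimal NumPy sketch assuming the standard Exact-Diffusion update with a heavy-ball momentum buffer substituted for the raw gradient, on a toy deterministic problem with heterogeneous per-node objectives (all variable names and parameter values are illustrative):

```python
import numpy as np

def edm_sketch(grads, x0, W, alpha=0.02, beta=0.9, iters=2000):
    """Exact-Diffusion-style iterations with heavy-ball momentum (assumed form).

    grads: list of per-node gradient callables
    x0:    shared initial point, shape (d,)
    W:     doubly-stochastic mixing matrix, shape (n, n)
    """
    n = len(grads)
    x = np.tile(x0, (n, 1)).astype(float)   # per-node iterates, shape (n, d)
    m = np.zeros_like(x)                    # per-node momentum buffers
    psi_prev = x.copy()
    for _ in range(iters):
        g = np.array([grads[i](x[i]) for i in range(n)])
        m = beta * m + g                    # heavy-ball momentum (assumption)
        psi = x - alpha * m                 # adapt: local descent step
        phi = psi + x - psi_prev            # correct: bias-correction term
        x = W @ phi                         # combine: average over neighbors
        psi_prev = psi
    return x

# Toy heterogeneous quadratics f_i(x) = 0.5 * (x - b_i)^2;
# the global minimizer of the sum is mean(b).
b = np.array([-2.0, 0.0, 1.0, 5.0])
grads = [lambda x, bi=bi: x - bi for bi in b]
# Doubly-stochastic mixing matrix for a 4-node ring topology.
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])
x_final = edm_sketch(grads, np.zeros(1), W)
```

In this deterministic toy, the bias-correction term `x - psi_prev` is what lets the nodes reach the exact global minimizer despite each node seeing a different objective; plain diffusion (dropping that term) would settle at a heterogeneity-dependent offset instead.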
Problem

Research questions and friction points this paper is trying to address.

Distributed Optimization Algorithms
Network Sparsity
Data Imbalance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Momentum-based Exact Diffusion (EDM)
Data Heterogeneity Correction
Optimal Solution Acceleration
Yuchen Hu
School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
Xi Chen
New York University, New York, NY 10012, USA
Weidong Liu
School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
Xiaojun Mao
Shanghai Jiao Tong University