🤖 AI Summary
This work develops a unified convergence analysis framework for decentralized gradient descent (DGD) and diffusion algorithms optimizing strongly convex, smooth functions over arbitrary undirected topologies. The analysis combines contraction mapping arguments with the mean Hessian theorem (MHT) to explicitly separate the dynamic convergence rate from the fixed-point bias, characterizing transient convergence speed and steady-state error independently. The framework accommodates time-varying step sizes, multiple local gradient updates, gradient and communication noise, and random network topologies, and it yields tight, non-asymptotic convergence bounds in both the noise-free and noisy regimes. The resulting analysis is concise, intuitive, and broadly applicable, offering a general theoretical foundation for decentralized optimization.
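For concreteness, the two recursions under analysis can be sketched in a few lines of Python; the quadratic local objectives, ring topology, Metropolis-style mixing weights, and step size below are illustrative assumptions rather than the paper's setup. DGD combines neighbors' iterates and then takes a gradient step, while diffusion (in its adapt-then-combine form) takes the gradient step first and then combines.

```python
import numpy as np

# Minimal sketch of DGD vs. diffusion, assuming quadratic local objectives
# f_i(x) = 0.5 * ||A_i x - b_i||^2 (strongly convex and smooth) on an
# undirected ring; all data, weights, and the step size are illustrative.
rng = np.random.default_rng(0)
n, d, alpha, iters = 5, 3, 0.02, 2000

A = [rng.standard_normal((d + 2, d)) for _ in range(n)]
b = [rng.standard_normal(d + 2) for _ in range(n)]
grad = lambda i, x: A[i].T @ (A[i] @ x - b[i])  # gradient of f_i at x

# Symmetric, doubly stochastic mixing matrix for the ring topology.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

X_dgd = np.zeros((n, d))  # row i holds agent i's iterate
X_dif = np.zeros((n, d))
for _ in range(iters):
    G_dgd = np.stack([grad(i, X_dgd[i]) for i in range(n)])
    G_dif = np.stack([grad(i, X_dif[i]) for i in range(n)])
    X_dgd = W @ X_dgd - alpha * G_dgd    # DGD: combine, then descend
    X_dif = W @ (X_dif - alpha * G_dif)  # diffusion: adapt, then combine

# Global minimizer of the average objective, for reference.
x_star = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)[0]
print("DGD steady-state error:      ", np.linalg.norm(X_dgd - x_star, axis=1).mean())
print("diffusion steady-state error:", np.linalg.norm(X_dif - x_star, axis=1).mean())
```

With a constant step size, both recursions contract toward a fixed point that is biased away from the global optimum; this dynamics/bias split is exactly what the framework makes explicit.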
📝 Abstract
The decentralized gradient descent (DGD) algorithm and its sibling, diffusion, are workhorses in decentralized machine learning, distributed inference and estimation, and multi-agent coordination. We propose a novel, principled framework for the analysis of DGD and diffusion for strongly convex, smooth objectives over arbitrary undirected topologies, using contraction mappings coupled with a result called the mean Hessian theorem (MHT). These tools yield tight convergence bounds in both the noise-free and noisy regimes. While these bounds are qualitatively similar to results found in the literature, our approach using contractions together with the MHT decouples the algorithm dynamics (how quickly the algorithm converges to its fixed point) from its asymptotic convergence properties (how far the fixed point is from the global optimum). This yields a simple, intuitive analysis that is accessible to a broad audience. Extensions are provided to multiple local gradient updates, time-varying step sizes, noisy gradients (stochastic DGD and diffusion), communication noise, and random topologies.
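The decoupling described in the abstract can be summarized schematically: a contraction argument bounds how fast the iterates approach the algorithm's fixed point, while the MHT bounds how far that fixed point sits from the global optimum for a sufficiently small constant step size α. The notation below (contraction factor ρ, fixed point x^∞, optimum x⋆) is a generic placeholder, not the paper's exact statement.

```latex
% Schematic shape of the decoupled bounds (placeholder notation):
\begin{align}
  \|x^{k} - x^{\infty}\| &\le \rho^{k}\,\|x^{0} - x^{\infty}\|
    && \text{(dynamics: contraction with rate } \rho < 1\text{)}, \\
  \|x^{\infty} - x^{\star}\| &= O(\alpha)
    && \text{(bias: fixed point vs. global optimum)}, \\
  \|x^{k} - x^{\star}\| &\le \rho^{k}\,\|x^{0} - x^{\infty}\| + O(\alpha)
    && \text{(combined via the triangle inequality)}.
\end{align}
```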