🤖 AI Summary
This work develops a unified convergence analysis framework for decentralized gradient descent (DGD) and diffusion algorithms optimizing strongly convex, smooth functions over arbitrary undirected topologies. The analysis combines contraction mapping arguments with the mean Hessian theorem (MHT) to explicitly separate the dynamic convergence rate from the fixed-point bias, characterizing transient convergence speed and steady-state error independently. The framework accommodates time-varying step sizes, multiple local gradient updates, gradient and communication noise, and random network topologies, and it yields tight, non-asymptotic convergence bounds in both the noise-free and noisy regimes. The resulting analysis is concise, intuitive, and broadly applicable, offering a general theoretical foundation for decentralized optimization.
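For concreteness, the two recursions under analysis can be sketched in a few lines of Python; the quadratic local objectives, ring topology, Metropolis-style mixing weights, and step size below are illustrative assumptions rather than the paper's setup. DGD combines neighbors' iterates and then takes a gradient step, while diffusion (in its adapt-then-combine form) takes the gradient step first and then combines.

```python
import numpy as np

# Minimal sketch of DGD vs. diffusion, assuming quadratic local objectives
# f_i(x) = 0.5 * ||A_i x - b_i||^2 (strongly convex and smooth) on an
# undirected ring; all data, weights, and the step size are illustrative.
rng = np.random.default_rng(0)
n, d, alpha, iters = 5, 3, 0.02, 2000

A = [rng.standard_normal((d + 2, d)) for _ in range(n)]
b = [rng.standard_normal(d + 2) for _ in range(n)]
grad = lambda i, x: A[i].T @ (A[i] @ x - b[i])  # gradient of f_i at x

# Symmetric, doubly stochastic mixing matrix for the ring topology.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.25

X_dgd = np.zeros((n, d))  # row i holds agent i's iterate
X_dif = np.zeros((n, d))
for _ in range(iters):
    G_dgd = np.stack([grad(i, X_dgd[i]) for i in range(n)])
    G_dif = np.stack([grad(i, X_dif[i]) for i in range(n)])
    X_dgd = W @ X_dgd - alpha * G_dgd    # DGD: combine, then descend
    X_dif = W @ (X_dif - alpha * G_dif)  # diffusion: adapt, then combine

# Global minimizer of the average objective, for reference.
x_star = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)[0]
print("DGD steady-state error:      ", np.linalg.norm(X_dgd - x_star, axis=1).mean())
print("diffusion steady-state error:", np.linalg.norm(X_dif - x_star, axis=1).mean())
```

With a constant step size, both recursions contract toward a fixed point that is biased away from the global optimum; this dynamics/bias split is exactly what the framework makes explicit.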
📝 Abstract
The decentralized gradient descent (DGD) algorithm and its sibling, diffusion, are workhorses in decentralized machine learning, distributed inference and estimation, and multi-agent coordination. We propose a novel, principled framework for the analysis of DGD and diffusion for strongly convex, smooth objectives over arbitrary undirected topologies, using contraction mappings coupled with a result called the mean Hessian theorem (MHT). These tools yield tight convergence bounds in both the noise-free and noisy regimes. While these bounds are qualitatively similar to results found in the literature, our approach using contractions together with the MHT decouples the algorithm dynamics (how quickly the algorithm converges to its fixed point) from its asymptotic convergence properties (how far the fixed point is from the global optimum). This yields a simple, intuitive analysis that is accessible to a broad audience. Extensions are provided to multiple local gradient updates, time-varying step sizes, noisy gradients (stochastic DGD and diffusion), communication noise, and random topologies.
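The decoupling described in the abstract can be summarized schematically: a contraction argument bounds how fast the iterates approach the algorithm's fixed point, while the MHT bounds how far that fixed point sits from the global optimum for a sufficiently small constant step size α. The notation below (contraction factor ρ, fixed point x^∞, optimum x⋆) is a generic placeholder, not the paper's exact statement.

```latex
% Schematic shape of the decoupled bounds (placeholder notation):
\begin{align}
  \|x^{k} - x^{\infty}\| &\le \rho^{k}\,\|x^{0} - x^{\infty}\|
    && \text{(dynamics: contraction with rate } \rho < 1\text{)}, \\
  \|x^{\infty} - x^{\star}\| &= O(\alpha)
    && \text{(bias: fixed point vs. global optimum)}, \\
  \|x^{k} - x^{\star}\| &\le \rho^{k}\,\|x^{0} - x^{\infty}\| + O(\alpha)
    && \text{(combined via the triangle inequality)}.
\end{align}
```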