🤖 AI Summary
We address non-convex distributed optimization over random time-varying undirected graphs. We propose the first provably convergent fully asynchronous decentralized algorithm for this setting. Our method integrates stochastic primal-dual updates, multi-step local stochastic gradient descent (SGD), and sparse non-blocking graph-based communication—requiring no global clock, explicit synchronization, or data homogeneity assumptions. We establish that it converges to an $\mathcal{O}(\sigma/\sqrt{nT})$-stationary point, matching the convergence rate of optimal synchronous algorithms. Experiments demonstrate significant improvements in communication efficiency and robustness under bandwidth constraints and dynamic topologies. Our key contribution is the first rigorous convergence guarantee for asynchronous decentralized algorithms in the non-convex regime, explicitly accommodating the randomness, asynchrony, and resource limitations inherent in real-world networks.
📝 Abstract
Stochastic decentralized optimization algorithms often suffer from issues such as synchronization overhead and intermittent communication. This paper proposes a $\underline{\rm F}$ully $\underline{\rm S}$tochastic $\underline{\rm P}$rimal $\underline{\rm D}$ual gradient $\underline{\rm A}$lgorithm (FSPDA) featuring an asynchronous decentralized procedure with (i) sparsified non-blocking communication on random undirected graphs and (ii) local stochastic gradient updates. FSPDA allows multiple local gradient steps to accelerate convergence to stationarity while finding a consensual solution with stochastic primal-dual updates. For problems with a smooth (possibly non-convex) objective function, we show that FSPDA converges to an $\mathcal{O}(\sigma/\sqrt{nT})$-stationary solution after $T$ iterations without assuming data heterogeneity. The performance of FSPDA is on par with state-of-the-art algorithms whose convergence depends on static graphs and synchronous updates. To the best of our knowledge, FSPDA is the first asynchronous algorithm that converges exactly in the non-convex setting. Numerical experiments are presented to show the benefits of FSPDA.
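To make the three ingredients concrete, here is a minimal toy sketch of the *flavor* of such a method — not the paper's FSPDA and not its analysis. All names, step sizes, and the quadratic objectives are illustrative assumptions: a randomly sampled edge per round stands in for sparse non-blocking communication on a random graph, a per-edge dual variable drives the consensus constraint, and the two awakened agents each take a few local gradient steps before exchanging iterates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem (illustrative only): n agents, agent i holds the scalar
# quadratic f_i(x) = 0.5 * (x - a_i)^2, so the minimizer of the average
# objective (1/n) * sum_i f_i is mean(a) = 2.0.
n = 5
a = np.arange(n, dtype=float)                 # local "data"
x = np.zeros(n)                               # one primal variable per agent
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
lam = {e: 0.0 for e in edges}                 # one dual variable per edge

def grad(i, xi):
    # Deterministic gradient for reproducibility; the stochastic setting
    # would add zero-mean noise here.
    return xi - a[i]

alpha, eta, beta, gamma = 0.02, 0.1, 0.05, 1.0   # assumed step sizes
K = 2                                            # local steps per activation
T = 40_000

for t in range(T):
    s = 1.0 if t < T // 2 else 0.1            # crude step-size decay
    i, j = edges[rng.integers(len(edges))]    # random edge wakes up
    for _ in range(K):                        # local gradient steps at the
        x[i] -= s * alpha * grad(i, x[i])     # two active agents only
        x[j] -= s * alpha * grad(j, x[j])
    d = x[i] - x[j]                           # one sparse exchange on the edge
    lam[(i, j)] += s * beta * d               # dual ascent on x_i = x_j
    x[i] -= s * eta * (lam[(i, j)] + gamma * d)   # primal descent with
    x[j] += s * eta * (lam[(i, j)] + gamma * d)   # augmented consensus term

print(x)   # all entries settle near mean(a) = 2.0
```

The dual variables are what let the iterates reach exact consensus at a stationary point despite heterogeneous local data: a pure gossip-averaging scheme with local steps would be biased toward each agent's own minimizer, whereas here the edge duals accumulate exactly the correction needed to cancel that drift.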