🤖 AI Summary
This work studies the stability and generalization behavior of Push-Sum–type decentralized algorithms over directed graphs, where topological asymmetry introduces bias and complicates performance analysis. The authors develop a unified uniform-stability framework that disentangles statistical error from topology-induced bias. They introduce an imbalance-aware consistency bound that, for the first time, decouples the topological imbalance parameter δ from the spectral gap (1−λ), revealing their joint influence on learning performance and clarifying when Push-Sum correction is necessary. By integrating uniform stability theory, Markov chain stationary distributions, and spectral graph theory, they derive a generalization error bound of Õ(1/√(mn) + γ/(δ(1−λ)) + γ) and identify the optimal early stopping time under convex objectives. In the non-convex setting satisfying the Polyak–Łojasiewicz condition, they establish optimization and generalization rates with dominant dependence κ(1+1/(δ(1−λ))), quantifying how topology constrains algorithmic performance.
📝 Abstract
Push-Sum-based decentralized learning enables optimization over directed communication networks, where information exchange may be asymmetric. While convergence properties of such methods are well understood, their finite-iteration stability and generalization behavior remain unclear due to structural bias induced by column-stochastic mixing and asymmetric error propagation. In this work, we develop a unified uniform-stability framework for the Stochastic Gradient Push (SGP) algorithm that captures the effect of directed topology. A key technical ingredient is an imbalance-aware consistency bound for Push-Sum, which controls consensus deviation through two quantities: the stationary distribution imbalance parameter $\delta$ and the spectral gap $(1-\lambda)$ governing mixing speed. This decomposition enables us to disentangle statistical effects from topology-induced bias. We establish finite-iteration stability and optimization guarantees for both convex objectives and non-convex objectives satisfying the Polyak--\L{}ojasiewicz condition. For convex problems, SGP attains excess generalization error of order $\tilde{\mathcal{O}}\!\left(\frac{1}{\sqrt{mn}}+\frac{\gamma}{\delta(1-\lambda)}+\gamma\right)$ under suitable step-size schedules, and we characterize the corresponding optimal early stopping time that minimizes this bound. For P\L{} objectives, we obtain convex-like optimization and generalization rates with dominant dependence proportional to $\kappa\!\left(1+\frac{1}{\delta(1-\lambda)}\right)$, where $\kappa$ denotes the condition number, revealing a multiplicative coupling between problem conditioning and directed communication topology. Our analysis clarifies when Push-Sum correction is necessary compared with standard decentralized SGD and quantifies how imbalance and mixing jointly shape the best attainable learning performance.
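To make the SGP iteration described above concrete, here is a minimal, self-contained sketch of Push-Sum gradient push on a toy directed graph. The 4-node topology, the quadratic local objectives $f_i(x)=\tfrac12(x-b_i)^2$, and the use of exact rather than stochastic gradients are illustrative assumptions for this sketch, not details taken from the paper.

```python
import numpy as np

# Illustrative sketch of Stochastic Gradient Push (SGP) over a directed graph.
# Assumptions (not from the paper): a 4-node directed ring with one extra edge
# (0 -> 2), local objectives f_i(x) = 0.5 * (x - b_i)^2, and exact gradients
# in place of stochastic ones, so the run is deterministic.

n = 4
b = np.array([1.0, 2.0, 3.0, 4.0])       # local targets; global optimum = mean(b) = 2.5

# Column-stochastic mixing matrix P (entry P[i, j]: weight node j sends to node i).
# Each node splits its mass uniformly over its out-neighbors (including itself).
P = np.zeros((n, n))
P[[0, 1, 2], 0] = 1.0 / 3.0              # node 0 sends to {0, 1, 2}
P[[1, 2], 1] = 0.5                       # node 1 sends to {1, 2}
P[[2, 3], 2] = 0.5                       # node 2 sends to {2, 3}
P[[3, 0], 3] = 0.5                       # node 3 sends to {3, 0}
assert np.allclose(P.sum(axis=0), 1.0)   # column-stochastic, but NOT row-stochastic

alpha, T = 0.01, 5000
x = b.copy()                             # parameters (carry topology-induced bias)
w = np.ones(n)                           # Push-Sum weights

for _ in range(T):
    z = x / w                            # de-biased estimates; gradients use these
    g = z - b                            # gradient of f_i at z_i
    x = P @ (x - alpha * g)              # gossip the gradient-updated parameters
    w = P @ w                            # gossip the weights along the same matrix

z = x / w
print("de-biased estimates:", z)         # each z_i near mean(b) = 2.5
print("Push-Sum weights:", w)            # nonuniform: reflects the imbalance delta
```

Because $P$ is column-stochastic but not row-stochastic here, the raw iterates $x_i$ alone would drift toward nodes with larger stationary weight; dividing by the Push-Sum weights $w_i$ (which converge to $n\pi_i$) removes exactly this topology-induced bias, which is the correction whose necessity the abstract analyzes.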