🤖 AI Summary
In distributed optimization, global synchronization incurs communication overhead that scales sharply with the number of nodes and edges. This paper proposes an asynchronous, randomized local coordination mechanism: each node independently samples a pairwise regularizer uniformly at random and communicates only with the few neighbors sharing that regularizer. The key innovation is replacing the proximal mapping of the global sum of regularizers with the proximal mapping of a single sampled regularizer, eliminating the need for global coordination; for graph-guided (pairwise) regularizers, the expected communication cost is a constant two messages per iteration, independent of network size. Theoretically, the algorithm attains an iteration complexity of Õ(ε⁻²) for convex objectives; under strong convexity, it reaches an ε-solution in O(ε⁻¹) iterations and a neighborhood of the solution in O(log(1/ε)) iterations. Experiments confirm significantly reduced communication volume while maintaining convergence rates consistent with the theoretical guarantees.
📝 Abstract
Distributed optimization requires nodes to coordinate, yet full synchronization scales poorly. When $n$ nodes collaborate through $m$ pairwise regularizers, standard methods demand $\mathcal{O}(m)$ communications per iteration. This paper proposes randomized local coordination: each node independently samples one regularizer uniformly and coordinates only with the nodes sharing that term. This exploits partial separability, where each regularizer $G_j$ depends only on a subset $S_j \subseteq \{1,\ldots,n\}$ of nodes. For graph-guided regularizers, where $|S_j| = 2$, expected communication drops to exactly 2 messages per iteration. The method achieves $\tilde{\mathcal{O}}(\varepsilon^{-2})$ iterations for convex objectives and, under strong convexity, $\mathcal{O}(\varepsilon^{-1})$ iterations to an $\varepsilon$-solution and $\mathcal{O}(\log(1/\varepsilon))$ iterations to a neighborhood of it. Replacing the proximal map of the sum $\sum_j G_j$ with the proximal map of a single randomly selected regularizer $G_j$ preserves convergence while eliminating global coordination. Experiments validate both the convergence rates and the communication efficiency on synthetic and real-world datasets.
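To make the mechanism concrete, here is a minimal sketch of one iteration for a graph-guided regularizer $G_j(x) = \lambda |x_u - x_v|$ on edge $j = (u, v)$, whose pairwise proximal map has a closed form (the pair's mean is preserved and the difference is soft-thresholded). The function names, the edge-sampling API, and the rescaling of the sampled prox by $m$ (the number of regularizers) are illustrative assumptions, not the paper's exact algorithm:

```python
import math
import random

def prox_pairwise_abs(x_u, x_v, lam):
    # Closed-form prox of lam * |x_u - x_v| over the pair (x_u, x_v):
    # the mean of the pair is preserved, and the difference is
    # soft-thresholded by 2 * lam.
    mean = 0.5 * (x_u + x_v)
    d = x_u - x_v
    d = math.copysign(max(abs(d) - 2.0 * lam, 0.0), d)
    return mean + 0.5 * d, mean - 0.5 * d

def local_coordination_step(x, grad, edges, step, lam, rng):
    # One iteration of the sampled-regularizer scheme (hypothetical API):
    # every node takes a local gradient step on its private loss, then one
    # edge j is drawn uniformly and only its two endpoints exchange values
    # and apply the pairwise prox -- 2 messages in expectation.
    g = grad(x)
    x = [xi - step * gi for xi, gi in zip(x, g)]
    u, v = rng.choice(edges)
    # Scaling the prox weight by len(edges) compensates (in expectation)
    # for touching only one of the m regularizers per iteration -- an
    # assumption here, not necessarily the paper's exact step size rule.
    x[u], x[v] = prox_pairwise_abs(x[u], x[v], step * lam * len(edges))
    return x
```

Note that only the two endpoints of the sampled edge communicate; all other nodes proceed with purely local computation, which is what removes the $\mathcal{O}(m)$ per-iteration coordination cost.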