🤖 AI Summary
This work addresses an open question in decentralized non-convex stochastic optimization: how heterogeneous gradient-estimator variances across nodes affect algorithm design and complexity. The authors propose D-NSS, an algorithm with a node-specific sampling strategy, and establish, for the first time, a sample complexity upper bound that depends on the arithmetic mean of the local standard deviations; a matching lower bound shows this dependence is optimal under heterogeneous variances. By integrating a variance reduction technique, they further develop D-NSS-VR, which attains an improved sample complexity bound under the mean-squared smoothness assumption while reducing both communication and computational costs. Experimental results confirm the superior performance of the proposed methods in heterogeneous environments.
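The claimed tightness of the arithmetic-mean dependence follows from the power-mean inequality: for local standard deviations $\sigma_1,\dots,\sigma_n$, the arithmetic mean never exceeds the quadratic mean, which never exceeds the worst case:

```latex
\frac{1}{n}\sum_{i=1}^{n}\sigma_i
\;\le\;
\sqrt{\frac{1}{n}\sum_{i=1}^{n}\sigma_i^2}
\;\le\;
\max_{1\le i\le n}\sigma_i
```

Both gaps can be as large as a factor of $\sqrt{n}$: if one node has noise level $\sigma$ and the remaining $n-1$ nodes are noiseless, the three quantities are $\sigma/n$, $\sigma/\sqrt{n}$, and $\sigma$, so a bound in terms of the arithmetic mean can be substantially tighter under strong heterogeneity.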
📝 Abstract
Decentralized optimization is critical for solving large-scale machine learning problems over distributed networks, where multiple nodes collaborate through local communication. In practice, the variances of stochastic gradient estimators often differ across nodes, yet their impact on algorithm design and complexity remains unclear. To address this issue, we propose D-NSS, a decentralized algorithm with node-specific sampling, and establish a sample complexity bound that depends on the arithmetic mean of the local standard deviations, which is tighter than existing bounds that depend on the worst case or the quadratic mean. We further derive a matching sample complexity lower bound under heterogeneous variances, thereby proving the optimality of this dependence. Moreover, we extend the framework with a variance reduction technique and develop D-NSS-VR, which, under the mean-squared smoothness assumption, attains an improved sample complexity bound while preserving the arithmetic-mean dependence. Finally, numerical experiments validate the theoretical results and demonstrate the effectiveness of the proposed algorithms.
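To make "node-specific sampling" concrete, here is a toy sketch of one natural allocation rule: give each node a per-iteration batch size proportional to its noise level. The exact D-NSS rule is not specified in this summary, so the proportional-to-`sigma_i` allocation below is an assumption for illustration only; it does show why such allocations tie the averaged gradient's variance to the arithmetic mean of the local standard deviations.

```python
import numpy as np

def node_batch_sizes(sigmas, total_budget):
    """Split a per-iteration sample budget across nodes in proportion
    to each node's gradient-noise standard deviation.

    Illustrative heuristic only; the actual D-NSS rule may differ.
    With b_i ∝ sigma_i, the variance of the network-averaged gradient,
    (1/n^2) * sum_i sigma_i^2 / b_i, reduces to
    ((1/n) * sum_i sigma_i)^2 / total_budget, i.e. it is governed by
    the *arithmetic mean* of the sigma_i rather than the worst case.
    """
    sigmas = np.asarray(sigmas, dtype=float)
    weights = sigmas / sigmas.sum()
    # Round to integers, but draw at least one sample per node.
    return np.maximum(1, np.round(weights * total_budget)).astype(int)

# Heterogeneous noise levels: the noisiest node gets most of the budget.
sigmas = [0.1, 0.5, 2.0, 8.0]
print(node_batch_sizes(sigmas, total_budget=100))
```

Under this allocation, a node with 80x the noise of its quietest peer draws roughly 80x as many samples per round, instead of every node sampling at the rate dictated by the worst node.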