Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity

📅 2025-03-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the optimal complexity of Byzantine-robust distributed stochastic optimization under data heterogeneity. It first decomposes the convergence error into two components, a non-vanishing Byzantine error and a vanishing optimization error, and derives lower bounds on both. For strongly convex and nonconvex objectives, the authors propose algorithms that integrate Nesterov's acceleration and variance reduction with a robust aggregation mechanism in a distributed stochastic gradient framework. In both settings these methods match the derived lower bounds up to logarithmic factors, showing that the bounds are tight, closing the previously identified gap between upper and lower bounds, and improving on prior approaches in both complexity guarantees and robustness under heterogeneous data.
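The summary mentions a robust aggregation mechanism at the server, but the specific aggregator is not stated here. Below is a minimal sketch, assuming a coordinate-wise trimmed mean, a standard Byzantine-robust rule used only for illustration; the function name, worker counts, and toy data are assumptions, not the paper's method.

```python
import numpy as np

def coordinate_trimmed_mean(grads, num_byzantine):
    """Robust aggregation sketch: coordinate-wise trimmed mean.

    grads: array of shape (n_workers, dim), one (possibly corrupted)
    stochastic gradient per worker. num_byzantine: assumed upper bound
    on the number of Byzantine workers. Trims the largest and smallest
    `num_byzantine` values in each coordinate before averaging.
    """
    sorted_grads = np.sort(grads, axis=0)          # sort each coordinate independently
    if num_byzantine > 0:
        sorted_grads = sorted_grads[num_byzantine:-num_byzantine]
    return sorted_grads.mean(axis=0)

# Toy usage: 10 workers, 2 of them Byzantine and sending large outliers.
rng = np.random.default_rng(0)
honest = rng.normal(loc=1.0, scale=0.1, size=(8, 5))
byzantine = 100.0 * np.ones((2, 5))                # adversarial gradients
grads = np.vstack([honest, byzantine])
print(coordinate_trimmed_mean(grads, num_byzantine=2))  # close to the honest mean
```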

📝 Abstract
In this paper, we establish tight lower bounds for Byzantine-robust distributed first-order stochastic optimization methods in both strongly convex and non-convex stochastic optimization. We reveal that when the distributed nodes have heterogeneous data, the convergence error comprises two components: a non-vanishing Byzantine error and a vanishing optimization error. We establish the lower bounds on the Byzantine error and on the minimum number of queries to a stochastic gradient oracle required to achieve an arbitrarily small optimization error. Nevertheless, we identify significant discrepancies between our established lower bounds and the existing upper bounds. To fill this gap, we leverage the techniques of Nesterov's acceleration and variance reduction to develop novel Byzantine-robust distributed stochastic optimization methods that provably match these lower bounds, up to logarithmic factors, implying that our established lower bounds are tight.
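To make the decomposition described in the abstract concrete, here is a schematic form of the bound. The symbols Δ_Byz, ε_opt, δ (fraction of Byzantine nodes), and ζ² (heterogeneity level) are illustrative assumptions, not the paper's exact notation or rates.

```latex
% Schematic error decomposition after T stochastic gradient oracle queries
% (illustrative only; symbols are assumptions, not the paper's statement).
\[
  \mathbb{E}\!\left[f(x_T)\right] - f^\star
  \;\lesssim\;
  \underbrace{\Delta_{\mathrm{Byz}}(\delta,\zeta^2)}_{\text{non-vanishing Byzantine error}}
  \;+\;
  \underbrace{\varepsilon_{\mathrm{opt}}(T)}_{\text{optimization error, vanishes as } T \to \infty}
\]
```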
Problem

Research questions and friction points this paper is trying to address.

What are the fundamental lower bounds on the non-vanishing Byzantine error and on the number of stochastic gradient oracle queries needed to reach a small optimization error?
How does data heterogeneity across distributed nodes shape the convergence error of Byzantine-robust methods?
Can algorithms built on Nesterov's acceleration and variance reduction match these lower bounds, proving the bounds tight?
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nesterov's acceleration incorporated into Byzantine-robust distributed updates to speed up convergence
Variance reduction to control stochastic gradient noise in the distributed setting
Proposed methods whose query complexity matches the established lower bounds up to logarithmic factors (a generic sketch of the acceleration plus variance-reduction combination follows this list)
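The sketch below shows the generic combination of SVRG-style variance reduction with a Nesterov loop on a toy least-squares problem for a single worker. It is not the paper's algorithm; the step size, momentum, and snapshot schedule are illustrative assumptions. In the distributed setting, the local estimate g would be sent to the server and combined across workers by a robust aggregator such as the trimmed mean sketched earlier.

```python
import numpy as np

# Toy least-squares problem: minimize 0.5/n * ||Ax - b||^2.
rng = np.random.default_rng(1)
A = rng.normal(size=(200, 10))
b = rng.normal(size=200)
n = A.shape[0]

def full_grad(x):
    # Full-batch gradient (SVRG snapshot anchor)
    return A.T @ (A @ x - b) / n

def stoch_grad(x, i):
    # Gradient of the i-th component 0.5 * (a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

x = np.zeros(10)
y = x.copy()                      # Nesterov extrapolation point
lr, momentum = 1e-3, 0.9          # illustrative choices, not the paper's

for epoch in range(20):
    snapshot = x.copy()
    mu = full_grad(snapshot)      # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # Variance-reduced gradient estimate at the extrapolated point y
        g = stoch_grad(y, i) - stoch_grad(snapshot, i) + mu
        x_new = y - lr * g
        y = x_new + momentum * (x_new - x)   # Nesterov extrapolation
        x = x_new

print("final objective:", 0.5 * np.mean((A @ x - b) ** 2))
```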
Qiankun Shi
School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China; Pengcheng Laboratory, Shenzhen, China
Jie Peng
School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China
Kun Yuan
Center for Machine Learning Research, Peking University, Beijing, China
Xiao Wang
School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China
Qing Ling
School of Computer Science and Engineering, Sun Yat-Sen University
Signal Processing · Optimization · Control