🤖 AI Summary
To jointly improve robustness and communication efficiency in federated learning under Byzantine adversaries, who may arbitrarily corrupt the gradients they report, this paper proposes Byrd-NAFL, the first Byzantine-robust federated learning algorithm to integrate Nesterov-accelerated gradients. Byrd-NAFL combines a Byzantine-resilient aggregation rule (e.g., the geometric median) with a momentum correction strategy. Theoretically, the paper establishes a finite-time convergence guarantee for non-convex, smooth loss functions while relaxing stringent assumptions such as gradient boundedness. Empirically, Byrd-NAFL outperforms state-of-the-art methods under diverse Byzantine attacks, achieving faster convergence, higher final accuracy, and stronger robustness, all without increasing communication overhead.
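The geometric median named above as an example aggregator can be computed with Weiszfeld's fixed-point iteration. The sketch below is a minimal NumPy illustration, not the paper's implementation; the function name, iteration count, and tolerance are all assumptions made for the example:

```python
import numpy as np

def geometric_median(grads, n_iters=50, eps=1e-8):
    """Weiszfeld iteration for the geometric median of worker gradients.

    grads: array of shape (num_workers, dim). Hypothetical helper for
    illustration; defaults are illustrative, not from the paper.
    """
    z = grads.mean(axis=0)                       # initialize at the mean
    for _ in range(n_iters):
        dists = np.linalg.norm(grads - z, axis=1)
        dists = np.maximum(dists, eps)           # guard against division by zero
        w = 1.0 / dists                          # inverse-distance weights
        z_new = (w[:, None] * grads).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:      # stop once the fixed point stabilizes
            break
        z = z_new
    return z
```

Because each iteration weights workers inversely to their distance from the current estimate, a minority of arbitrarily corrupted gradients exerts only bounded influence on the aggregate, which is what makes the geometric median a standard Byzantine-resilient rule.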
📝 Abstract
We investigate robust federated learning, where a group of workers collaboratively trains a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behavior. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL seamlessly integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast convergence while safeguarding against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate its superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies.
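The abstract does not spell out the update equations, so the following is only a minimal sketch of how Nesterov momentum and a robust aggregator can be composed on the server, assuming the common look-ahead (Sutskever) form of Nesterov momentum. The function `byrd_nafl_round`, the callable `worker_grad_fns`, and all hyperparameters are hypothetical names introduced for this example, not the paper's API:

```python
import numpy as np

def byrd_nafl_round(x, v, worker_grad_fns, aggregate, lr=0.01, beta=0.9):
    """One hypothetical server round combining Nesterov momentum with a
    Byzantine-resilient aggregator (e.g., geometric_median above).
    The actual Byrd-NAFL update may differ; this only shows the structure.
    """
    x_look = x + beta * v                            # Nesterov look-ahead point
    grads = np.stack([g(x_look) for g in worker_grad_fns])  # worker reports
                                                            # (some possibly Byzantine)
    g_agg = aggregate(grads)                         # robust aggregation step
    v = beta * v - lr * g_agg                        # momentum update
    return x + v, v                                  # new model and momentum
```

Note that the look-ahead point is the single vector the server broadcasts each round, so under this sketch the acceleration adds no extra communication per round, consistent with the claim that robustness and speed are gained without additional communication overhead.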