Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries

📅 2025-11-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
To jointly optimize robustness and communication efficiency in federated learning under Byzantine adversaries—who may arbitrarily corrupt gradients—this paper proposes Byrd-NAFL, the first Byzantine-robust federated learning algorithm integrating Nesterov-accelerated gradients. Byrd-NAFL synergistically combines Byzantine-resilient aggregation (e.g., geometric median) with a momentum correction strategy. Theoretically, it establishes finite-time convergence guarantees for non-convex, smooth loss functions while significantly relaxing stringent assumptions such as gradient boundedness. Empirically, Byrd-NAFL outperforms state-of-the-art methods across diverse Byzantine attacks, achieving faster convergence, higher final accuracy, and superior robustness—without increasing communication overhead.
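The robust aggregation step mentioned above (geometric median) can be sketched as follows. This is an illustrative Weiszfeld-style approximation, not the paper's exact procedure; the worker gradients and the stopping tolerance are assumptions for the example.

```python
import numpy as np

def geometric_median(grads, iters=100, eps=1e-8):
    """Approximate the geometric median of worker gradients via
    Weiszfeld iterations, a standard Byzantine-resilient aggregator."""
    grads = np.asarray(grads, dtype=float)  # shape: (num_workers, dim)
    z = grads.mean(axis=0)                  # initialize at the mean
    for _ in range(iters):
        d = np.linalg.norm(grads - z, axis=1)
        d = np.maximum(d, eps)              # guard against division by zero
        w = 1.0 / d
        z_new = (w[:, None] * grads).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            break
        z = z_new
    return z

# Two honest workers roughly agree; one Byzantine worker sends a huge gradient.
honest = [np.array([1.0, 1.0]), np.array([1.1, 0.9])]
byzantine = [np.array([100.0, -100.0])]
agg = geometric_median(honest + byzantine)  # stays near the honest cluster
```

Unlike the mean, which the single corrupted gradient would drag far off course, the geometric median remains close to the honest workers' gradients, which is why such rules serve as the robust building block here.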

📝 Abstract
We investigate robust federated learning, where a group of workers collaboratively trains a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL seamlessly integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast, safeguarded convergence under gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL for non-convex, smooth loss functions under a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate its superiority over existing benchmarks in terms of convergence speed, accuracy, and resilience to diverse Byzantine attack strategies.
Problem

Research questions and friction points this paper is trying to address.

Developing robust federated learning against Byzantine adversaries' malicious behaviors
Enhancing communication efficiency and resilience to gradient corruption attacks
Achieving fast convergence with Byzantine-resilient aggregation under non-convex losses
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates Nesterov's momentum for accelerated convergence
Uses Byzantine-resilient aggregation rules against malicious behaviors
Ensures fast, safeguarded convergence under gradient corruption
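The combination described in the bullets above can be sketched as a single server step: robustly aggregate the workers' gradients, then apply a Nesterov-style momentum update. This is a minimal illustration, assuming a coordinate-wise median aggregator and a toy quadratic objective; the function names, step size, and momentum coefficient are not from the paper.

```python
import numpy as np

def nesterov_robust_step(x, v, worker_grads, aggregate, lr=0.05, beta=0.9):
    """One server update: Byzantine-resilient aggregation followed by a
    Nesterov-accelerated step. `aggregate` is any robust rule, e.g. a
    coordinate-wise median; this formulation mirrors standard NAG."""
    g = aggregate(worker_grads)            # robust aggregate, not the mean
    v_new = beta * v + g                   # momentum buffer
    x_new = x - lr * (g + beta * v_new)    # Nesterov look-ahead correction
    return x_new, v_new

# Toy run: minimize f(x) = ||x||^2 / 2 (gradient = x) with one corrupted worker.
np.random.seed(0)
median = lambda gs: np.median(np.stack(gs), axis=0)  # coordinate-wise median
x, v = np.array([5.0, -3.0]), np.zeros(2)
for _ in range(200):
    honest = [x + 0.01 * np.random.randn(2) for _ in range(4)]
    corrupted = [np.array([1e6, 1e6])]     # Byzantine gradient
    x, v = nesterov_robust_step(x, v, honest + corrupted, median)
```

Despite the corrupted worker sending an enormous gradient every round, the median filters it out and the momentum-accelerated iterates still converge toward the minimizer at the origin.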