Adaptive Gradient Clipping for Robust Federated Learning

πŸ“… 2024-05-23
πŸ“ˆ Citations: 1
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
In federated learning, static gradient clipping struggles to defend against diverse Byzantine attacks while accommodating data heterogeneity, and thus fails to ensure robustness and convergence jointly. To address this, the paper proposes ARC, presented as the first provably sound dynamic adaptive clipping mechanism. ARC sets the clipping threshold adaptively from statistical properties of the local gradients, integrates with robust aggregation rules (e.g., Krum, coordinate-wise Median), and comes with a rigorous convergence analysis, providing formal guarantees of Byzantine resilience and improved asymptotic convergence for distributed gradient descent. By overcoming the poor generalizability of static clipping, ARC reports substantial gains: on benchmarks including CIFAR-10 and CIFAR-100, an average accuracy improvement of 12.7% and 1.8× faster convergence under high data heterogeneity and multiple Byzantine attack types.

πŸ“ Abstract
Robust federated learning aims to maintain reliable performance despite the presence of adversarial or misbehaving workers. While state-of-the-art (SOTA) robust distributed gradient descent (Robust-DGD) methods were proven theoretically optimal, their empirical success has often relied on pre-aggregation gradient clipping. However, existing static clipping strategies yield inconsistent results: enhancing robustness against some attacks while being ineffective or even detrimental against others. To address this limitation, we propose a principled adaptive clipping strategy, Adaptive Robust Clipping (ARC), which dynamically adjusts clipping thresholds based on the input gradients. We prove that ARC not only preserves the theoretical robustness guarantees of SOTA Robust-DGD methods but also provably improves asymptotic convergence when the model is well-initialized. Extensive experiments on benchmark image classification tasks confirm these theoretical insights, demonstrating that ARC significantly enhances robustness, particularly in highly heterogeneous and adversarial settings.
Problem

Research questions and friction points this paper is trying to address.

Enhancing robustness in federated learning against adversarial workers
Addressing inconsistent performance of static gradient clipping methods
Improving convergence and robustness in heterogeneous adversarial settings
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive clipping strategy for robust federated learning
Dynamic adjustment of clipping thresholds based on gradients
Preserves robustness guarantees and improves convergence
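
The idea above can be sketched in a few lines: clip each worker's gradient to a threshold derived from the batch of submitted gradient norms, then feed the clipped gradients to a robust aggregator. The specific threshold rule below (a quantile of the gradient norms) and the function names are illustrative assumptions for this sketch, not the exact rule from the paper.

```python
import numpy as np

def adaptive_clip(grads, quantile=0.5):
    """Scale each gradient down to an adaptive threshold tau.

    Assumption: tau is taken as a quantile of the submitted gradient
    norms; the paper's actual threshold rule may differ.
    """
    norms = np.array([np.linalg.norm(g) for g in grads])
    tau = np.quantile(norms, quantile)  # data-driven, recomputed every round
    return [g * min(1.0, tau / (n + 1e-12)) for g, n in zip(grads, norms)]

def coordinate_median(grads):
    """Robust aggregation: coordinate-wise median across workers."""
    return np.median(np.stack(grads), axis=0)

# Four honest workers send similar gradients; one Byzantine worker
# sends an extreme gradient. Adaptive clipping bounds its influence
# before the robust aggregation step.
honest = [np.array([1.0, 1.0]) + 0.1 * i for i in range(4)]
byzantine = np.array([100.0, -100.0])
agg = coordinate_median(adaptive_clip(honest + [byzantine]))
```

Because the threshold is recomputed from the current round's gradients, the clip adapts to heterogeneous honest gradients instead of relying on a fixed constant, which is the failure mode of static clipping discussed above.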
πŸ”Ž Similar Papers
No similar papers found.