Byzantine-Robust and Differentially Private Federated Optimization under Weaker Assumptions

πŸ“… 2026-03-24
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work tackles the challenge of simultaneously achieving Byzantine robustness and differential privacy in federated learning under weak assumptions. To this end, the authors propose Byz-Clip21-SGD2M, an algorithm that combines robust aggregation, a double-momentum mechanism, and adaptive gradient clipping. Requiring only standard smoothness and sub-Gaussian gradient-noise assumptions, it is the first method to unify both guarantees while providing high-probability convergence. The analysis shows that, in the absence of attacks, the algorithm recovers the state-of-the-art convergence rate, and that under Byzantine attacks with privacy constraints it improves utility guarantees. Empirical evaluations on MNIST with both CNN and MLP architectures validate the effectiveness of the proposed approach.

πŸ“ Abstract
Federated Learning (FL) enables heterogeneous clients to collaboratively train a shared model without centralizing their raw data, offering an inherent level of privacy. However, gradients and model updates can still leak sensitive information, while malicious servers may mount adversarial attacks such as Byzantine manipulation. These vulnerabilities highlight the need to address differential privacy (DP) and Byzantine robustness within a unified framework. Existing approaches, however, often rely on unrealistic assumptions such as bounded gradients, require auxiliary server-side datasets, or fail to provide convergence guarantees. We address these limitations by proposing Byz-Clip21-SGD2M, a new algorithm that integrates robust aggregation with double momentum and carefully designed clipping. We prove high-probability convergence guarantees under standard $L$-smoothness and $Οƒ$-sub-Gaussian gradient noise assumptions, thereby relaxing conditions that dominate prior work. Our analysis recovers state-of-the-art convergence rates in the absence of adversaries and improves utility guarantees under Byzantine and DP settings. Empirical evaluations on CNN and MLP models trained on MNIST further validate the effectiveness of our approach.
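The abstract describes three ingredients: per-client momentum with clipped updates, a robust server-side aggregator, and a second (server-side) momentum before the model step. The sketch below illustrates one round under illustrative assumptions only: plain norm clipping stands in for the paper's Clip21-style clipping, a coordinate-wise median stands in for the unspecified robust aggregator, and all function names, momentum weights, and parameters are hypothetical, not the authors' actual method.

```python
import numpy as np

def clip(v, tau):
    # Standard norm clipping: scale v down if its norm exceeds tau.
    n = np.linalg.norm(v)
    return v if n <= tau else v * (tau / n)

def double_momentum_round(x, client_grads, client_momenta, server_m,
                          tau=1.0, alpha=0.9, beta=0.9, lr=0.1,
                          aggregate=lambda G: np.median(G, axis=0)):
    """One hypothetical round: per-client momentum + clipping,
    robust aggregation, then server-side momentum and a model step."""
    new_momenta, msgs = [], []
    for g, m in zip(client_grads, client_momenta):
        # First momentum level: each client smooths its stochastic gradient.
        m_new = (1 - alpha) * m + alpha * g
        new_momenta.append(m_new)
        # Clip the message before sending (bounds any one client's influence,
        # which is also the standard hook for adding DP noise, omitted here).
        msgs.append(clip(m_new, tau))
    agg = aggregate(np.stack(msgs))                # robust aggregation
    server_m = (1 - beta) * server_m + beta * agg  # second momentum level
    x = x - lr * server_m                          # model update
    return x, new_momenta, server_m

# Toy usage: four honest clients and one Byzantine client sending a huge
# vector; the median-based aggregation keeps the update bounded.
d = 3
x0 = np.zeros(d)
grads = [np.ones(d)] * 4 + [1e6 * np.ones(d)]
momenta = [np.zeros(d)] * 5
x1, momenta, sm = double_momentum_round(x0, grads, momenta,
                                        np.zeros(d), tau=10.0)
```

The median is only a placeholder for whatever agreement-based aggregator the paper uses; the key structural point the sketch conveys is that clipping happens before aggregation, so a single malicious client cannot inject an unbounded update.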
Problem

Research questions and friction points this paper is trying to address.

Byzantine robustness
differential privacy
federated learning
convergence guarantees
gradient clipping
Innovation

Methods, ideas, or system contributions that make the work stand out.

Byzantine robustness
differential privacy
federated optimization
gradient clipping
convergence guarantees