Trade-off in Estimating the Number of Byzantine Clients in Federated Learning

📅 2025-10-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In federated learning, the number $f$ of Byzantine clients is unknown and must be estimated as $\hat{f}$ in order to select an appropriate robust aggregation rule, yet existing work lacks a systematic theoretical study of the trade-offs introduced by estimation error. Method: We develop a unified analytical framework integrating robust aggregation mechanisms with distributed optimization theory to derive tight upper and lower bounds on the worst-case error under varying $\hat{f}$. Contribution/Results: We provide the first theoretical characterization of how the error in $\hat{f}$ affects convergence: underestimation ($\hat{f}<f$) can make the worst-case error arbitrarily large, while non-underestimation ($\hat{f}\ge f$) yields an optimal error bound of $\Theta(\hat{f}/(n-f-\hat{f}))$, revealing an intrinsic robustness–accuracy trade-off. Our analysis establishes necessary and sufficient conditions for optimal robust aggregation and provides principled guidance for designing adaptive robust aggregators in practical federated systems.

📝 Abstract

Federated learning has attracted increasing attention in recent large-scale optimization and machine learning research and applications, but it is also vulnerable to Byzantine clients that can send arbitrary erroneous signals. Robust aggregators are commonly used to resist Byzantine clients. This usually requires estimating the unknown number $f$ of Byzantine clients and then selecting an aggregator with a suitable degree of robustness (i.e., the maximum number $\hat{f}$ of Byzantine clients the aggregator tolerates). Such an estimate should have an important effect on performance, which to our knowledge has not been systematically studied. This work fills the gap by theoretically analyzing the worst-case error of aggregators, and of the federated learning algorithms they induce, for all cases of $\hat{f}$ and $f$. Specifically, we show that underestimation ($\hat{f}<f$) can lead to arbitrarily poor performance for both aggregators and federated learning. For non-underestimation ($\hat{f}\ge f$), we prove optimal lower and upper bounds of the same order on the errors of both aggregators and federated learning. All of these optimal bounds are proportional to $\hat{f}/(n-f-\hat{f})$, where $n$ is the number of clients, and increase monotonically with $\hat{f}$. This indicates a fundamental trade-off: while an aggregator with a larger robustness degree $\hat{f}$ can solve federated learning problems over a wider range $f\in [0,\hat{f}]$, its performance can deteriorate when there are actually fewer or even no Byzantine clients (i.e., $f\in [0,\hat{f})$).
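
To make the monotonicity claim concrete, the following minimal sketch tabulates the factor $\hat{f}/(n-f-\hat{f})$ for a fixed $n$ and $f$ as $\hat{f}$ grows. The function name `error_factor`, the example values, and the check $n > f + \hat{f}$ (needed for the expression to be positive and finite) are assumptions made for illustration; constants and problem-dependent terms in the actual bounds are omitted.

```python
def error_factor(n: int, f: int, f_hat: int) -> float:
    """Return f_hat / (n - f - f_hat), the factor the bounds are proportional to.

    Hypothetical helper: only the ratio from the abstract is computed here,
    not the full bound with its constants.
    """
    if not 0 <= f <= f_hat:
        raise ValueError("requires 0 <= f <= f_hat (non-underestimation)")
    denom = n - f - f_hat
    if denom <= 0:
        # Assumption: the ratio is only meaningful when n > f + f_hat.
        raise ValueError("requires n > f + f_hat")
    return f_hat / denom


# Illustrative values: 20 clients, 2 of them Byzantine.
n, f = 20, 2
factors = [error_factor(n, f, f_hat) for f_hat in range(f, 9)]

for f_hat, value in zip(range(f, 9), factors):
    print(f"f_hat={f_hat}: factor={value:.3f}")
```

In this example the factor grows from 0.125 at $\hat{f}=2$ to 0.8 at $\hat{f}=8$, so every extra unit of assumed robustness costs accuracy even though the true $f$ never changes.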

Problem

Research questions and friction points this paper is trying to address.

Analyzing worst-case error bounds for Byzantine-robust federated learning

Studying performance trade-offs between robustness and accuracy

Investigating effects of underestimating Byzantine client numbers

Innovation

Methods, ideas, or system contributions that make the work stand out.

Analyzing worst-case error of Byzantine-robust aggregators

Proving optimal bounds for non-underestimation scenarios

Revealing trade-off between robustness and performance
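
Both sides of this trade-off can be illustrated with a toy aggregator. The sketch below uses a scalar trimmed mean, a common robust aggregator chosen here purely for illustration; the source does not state which aggregators it analyzes, and the function name, inputs, and values are all hypothetical.

```python
def trimmed_mean(values, f_hat):
    """Drop the f_hat smallest and f_hat largest values, then average the rest."""
    n = len(values)
    if f_hat < 0 or 2 * f_hat >= n:
        raise ValueError("requires 0 <= f_hat < n / 2")
    kept = sorted(values)[f_hat:n - f_hat]
    return sum(kept) / len(kept)


# Case 1: f = 2 Byzantine clients send a huge value; honest clients send 1.0.
honest = [1.0] * 8
byzantine = [1e6] * 2
received = honest + byzantine

under = trimmed_mean(received, f_hat=1)  # underestimation: one outlier survives
exact = trimmed_mean(received, f_hat=2)  # non-underestimation: outliers removed

# Case 2: f = 0, but the honest values are skewed (true mean is 1.0).
skewed_honest = [0.0] * 8 + [5.0] * 2
no_trim = trimmed_mean(skewed_honest, f_hat=0)    # plain average
over_trim = trimmed_mean(skewed_honest, f_hat=2)  # trims away honest information

print(under, exact, no_trim, over_trim)
```

With $\hat{f}=1<f$, the surviving Byzantine value drags the output as far as the attacker likes, since the error scales with the value sent. With $f=0$ and $\hat{f}=2$, the aggregator discards legitimate honest values and misses the true mean, which is the cost of overestimation in this toy setting.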

🔎 Similar Papers

LoByITFL: Low Communication Secure and Private Federated Learning