FNBench: Benchmarking Robust Federated Learning against Noisy Labels

📅 2025-05-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
In federated learning (FL), severe and heterogeneous label noise across clients critically undermines model robustness, yet no unified, systematic benchmark exists for evaluating noise-resilient methods. To address this, we introduce FNBench, the first comprehensive benchmark for evaluating FL robustness under label noise. It covers three realistic noise patterns (synthetic label noise, human annotation errors, and systematic errors), uniformly assessed across six datasets spanning image and text modalities and eighteen state-of-the-art FL methods. We further provide observations on why label noise degrades FL performance and, building on them, propose a representation-aware regularization method that enhances feature discriminability and noise resilience. Experiments demonstrate that this regularization consistently improves generalization across diverse FL algorithms. All code, noise configurations, and evaluation toolchains are publicly released.

📝 Abstract
Robustness to label noise is a significant challenge in federated learning (FL). From a data-centric perspective, the quality of distributed datasets cannot be guaranteed, since annotations from different clients contain complicated label noise of varying degrees, which causes performance degradation. There have been some early attempts to tackle noisy labels in FL, but a benchmark study that comprehensively evaluates their practical performance under unified settings is still lacking. To this end, we propose FNBench, the first benchmark study to provide an experimental investigation covering three diverse label noise patterns: synthetic label noise, imperfect human-annotation errors, and systematic errors. Our evaluation incorporates eighteen state-of-the-art methods over five image recognition datasets and one text classification dataset. Meanwhile, we provide observations to explain why noisy labels impair FL and, based on these observations, exploit a representation-aware regularization method to enhance the robustness of existing methods against noisy labels. Finally, we discuss the limitations of this work and propose three-fold future directions. To facilitate the related communities, our source code is open-sourced at https://github.com/Sprinter1999/FNBench.
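FNBench's exact noise generators live in its repository; as a rough illustration of what "synthetic label noise" means in such benchmarks, the sketch below shows the two standard injection schemes (symmetric flipping and pair flipping). The function names and signatures are hypothetical, not taken from the FNBench codebase.

```python
import random

def inject_symmetric_noise(labels, noise_rate, num_classes, seed=0):
    """With probability noise_rate, flip a label to a uniformly random *other* class."""
    rng = random.Random(seed)
    noisy = list(labels)
    for i, y in enumerate(noisy):
        if rng.random() < noise_rate:
            # Draw from the num_classes - 1 classes that are not y.
            c = rng.randrange(num_classes - 1)
            noisy[i] = c if c < y else c + 1
    return noisy

def inject_pair_noise(labels, noise_rate, num_classes, seed=0):
    """With probability noise_rate, flip class c to its 'pair' (c + 1) % num_classes."""
    rng = random.Random(seed)
    return [(y + 1) % num_classes if rng.random() < noise_rate else y
            for y in labels]
```

In a federated setting, a benchmark would typically apply these per client with client-specific noise rates, which is what makes the noise heterogeneous across the federation.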
Problem

Research questions and friction points this paper is trying to address.

Benchmarking FL robustness against diverse label noise patterns
Evaluating 18 methods on image and text datasets
Proposing regularization to enhance FL noise resilience
Innovation

Methods, ideas, or system contributions that make the work stand out.

FNBench benchmarks FL robustness to noisy labels
Evaluates 18 methods across diverse noise patterns
Proposes representation-aware regularization for robustness
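The abstract does not spell out the form of the representation-aware regularization; one common instantiation of this idea is a class-prototype pull term added to the local training loss, which penalizes representations that drift far from their class centroid. The sketch below is a hypothetical illustration of that pattern, not the paper's actual formulation.

```python
def prototype_regularizer(features, labels, prototypes, lam=0.1):
    """Mean squared distance between each sample's feature vector and its
    class prototype, scaled by lam. Pulling features toward per-class
    prototypes encourages compact, discriminative representation clusters,
    which tends to make noisy samples easier to separate.
    (Hypothetical instantiation; the paper's exact term may differ.)"""
    penalty = 0.0
    for f, y in zip(features, labels):
        p = prototypes[y]
        penalty += sum((fi - pi) ** 2 for fi, pi in zip(f, p))
    return lam * penalty / max(len(features), 1)
```

In practice such a term would be added to the client's classification loss during local training, with the prototypes estimated from (or aggregated across) client feature statistics.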
👥 Authors
Xuefeng Jiang
Institute of Computing Technology, Chinese Academy of Sciences
Weakly-supervised Learning · Distributed Optimization · Autonomous Driving · Noisy Label Learning
Jia Li
Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China, and also with the University of Chinese Academy of Sciences, Beijing, China
Nannan Wu
Huazhong University of Science and Technology, Wuhan, Hubei province, China
Zhiyuan Wu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, and also with the University of Chinese Academy of Sciences, Beijing, China
Xujing Li
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, and also with the University of Chinese Academy of Sciences, Beijing, China
Sheng Sun
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Yuwei Wang
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Gang Xu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
Qi Li
Institution for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
Min Liu
Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, and also with the Zhongguancun Laboratory, Beijing, China