Robust Federated Learning against Noisy Clients via Masked Optimization

📅 2025-06-02

📈 Citations: 0

✨ Influential: 0

career value

176K/year

🤖 AI Summary

In federated learning, severe label noise heterogeneity across clients—especially when high-noise clients dominate—significantly degrades model performance. To address this, we propose MaskedOptim, a two-stage framework: (i) dynamically identifying high-noise clients during local training, and (ii) refining their noisy labels via a differentiable, end-to-end pseudo-label correction mechanism, coupled with geometric median aggregation instead of FedAvg to enhance robustness. Our key contribution is the first noise-aware, two-stage co-optimization paradigm that jointly improves label quality adaptively and strengthens aggregation resilience against label corruption. Extensive experiments on three image and one text dataset demonstrate that MaskedOptim consistently outperforms 16 state-of-the-art baselines, achieving an average 4.2% improvement in global accuracy under noisy settings, while exhibiting strong generalization and robustness.

Technology Category

Application Category

📝 Abstract

In recent years, federated learning (FL) has made significant advance in privacy-sensitive applications. However, it can be hard to ensure that FL participants provide well-annotated data for training. The corresponding annotations from different clients often contain complex label noise at varying levels. This label noise issue has a substantial impact on the performance of the trained models, and clients with greater noise levels can be largely attributed for this degradation. To this end, it is necessary to develop an effective optimization strategy to alleviate the adverse effects of these noisy clients.In this study, we present a two-stage optimization framework, MaskedOptim, to address this intricate label noise problem. The first stage is designed to facilitate the detection of noisy clients with higher label noise rates. The second stage focuses on rectifying the labels of the noisy clients' data through an end-to-end label correction mechanism, aiming to mitigate the negative impacts caused by misinformation within datasets. This is achieved by learning the potential ground-truth labels of the noisy clients' datasets via backpropagation. To further enhance the training robustness, we apply the geometric median based model aggregation instead of the commonly-used vanilla averaged model aggregation. We implement sixteen related methods and conduct evaluations on three image datasets and one text dataset with diverse label noise patterns for a comprehensive comparison. Extensive experimental results indicate that our proposed framework shows its robustness in different scenarios. Additionally, our label correction framework effectively enhances the data quality of the detected noisy clients' local datasets. % Our codes will be open-sourced to facilitate related research communities. Our codes are available via https://github.com/Sprinter1999/MaskedOptim .

Problem

Research questions and friction points this paper is trying to address.

Addresses label noise in federated learning from noisy clients

Detects and corrects noisy client data via two-stage optimization

Enhances model robustness with geometric median aggregation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-stage optimization for noisy clients

End-to-end label correction mechanism

Geometric median based model aggregation

🔎 Similar Papers

No similar papers found.

💼 Related Jobs

Research Engineer, Privacy

OpenAI

$380K – $445K • Offers Equity

San Francisco

Research Scientist Intern, Optimization, Privacy and Inference (PhD)