Federated Learning Clients Clustering with Adaptation to Data Drifts

📅 2024-11-03

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In federated learning, dynamic shifts in client data distributions (such as label shift, feature shift, and covariate shift) undermine cluster homogeneity, leading to slow convergence and degraded model accuracy. To address this, the paper proposes Fielding, an adaptive online re-clustering framework built around a lightweight dynamic clustering mechanism: it detects drifts on all clients and re-clusters selectively based on label distributions, balancing cluster optimality against model performance. The framework remains robust to malicious clients and to varying degrees of heterogeneity. It operates without access to global data and keeps communication and computational overhead low. Experiments show that Fielding improves final model accuracy by 1.9%-5.9% and reaches target accuracies 1.16x-2.61x faster.

📝 Abstract

Federated Learning (FL) enables deep learning model training across edge devices and protects user privacy by retaining raw data locally. Data heterogeneity in client distributions slows model convergence and causes training to plateau at reduced accuracy. Clustered FL solutions address this by grouping clients with statistically similar data and training a model for each cluster. However, maintaining consistent client similarity within each group becomes challenging when data drifts occur, significantly impacting model accuracy. In this paper, we introduce Fielding, a clustered FL framework that handles data drifts promptly with low overhead. Fielding detects drifts on all clients and performs selective label distribution-based re-clustering to balance cluster optimality and model performance, remaining robust to malicious clients and varied degrees of heterogeneity. Our evaluations show that Fielding improves final model accuracy by 1.9%-5.9% and reaches target accuracies 1.16x-2.61x faster.

Problem

Research questions and friction points this paper is trying to address.

Address client heterogeneity in Federated Learning to improve convergence and accuracy

Handle diverse data drifts in clustered FL without significant overhead

Maintain cluster quality and model performance despite malicious clients and varying degrees of heterogeneity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Clustered FL groups clients with statistically similar data and trains a model per cluster

Fielding detects drifts on all clients and adapts to them with low overhead

Selective re-clustering balances quality and performance
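
The idea of selective, label distribution-based re-clustering can be illustrated with a minimal sketch. This is not the paper's algorithm: the L1 distance, the `drift_threshold` value, the fixed cluster centroids, and the function names `label_distribution` and `selective_recluster` are all assumptions made for illustration. The sketch only shows the general pattern of measuring drift on every client while reassigning only the clients whose label distribution moved noticeably.

```python
from collections import Counter


def label_distribution(labels, num_classes):
    """Return the normalized label histogram of one client (labels must be non-empty)."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]


def l1_distance(p, q):
    """L1 distance between two distributions (an assumed metric, not taken from the paper)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def selective_recluster(old_dists, new_dists, assignments, centroids,
                        drift_threshold=0.2):
    """Check every client for drift, but move only the drifted ones.

    A client counts as drifted when its label distribution changed by more
    than drift_threshold in L1 distance (hypothetical rule). Drifted clients
    are reassigned to the nearest centroid; all others keep their cluster.
    """
    updated = list(assignments)
    drifted = []
    for client, (old, new) in enumerate(zip(old_dists, new_dists)):
        if l1_distance(old, new) <= drift_threshold:
            continue  # no significant drift: leave the assignment untouched
        drifted.append(client)
        updated[client] = min(
            range(len(centroids)),
            key=lambda k: l1_distance(centroids[k], new),
        )
    return updated, drifted


# Toy example: two clusters over two classes, three clients.
centroids = [[0.9, 0.1], [0.1, 0.9]]
old = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
new = [[0.85, 0.15], [0.2, 0.8], [0.1, 0.9]]  # only client 1 drifts strongly
assignments, drifted = selective_recluster(old, new, [0, 0, 1], centroids)
```

In this toy run only client 1 crosses the threshold, so it alone moves from cluster 0 to cluster 1 while the other assignments stay fixed. A real system would also need to update centroids and decide when a full re-clustering is warranted, which this sketch deliberately leaves out.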

🔎 Similar Papers

Unveiling Group-Specific Distributed Concept Drift: A Fairness Imperative in Federated Learning