🤖 AI Summary
In federated learning, dynamic shifts in client data distributions—such as label shift, feature shift, and covariate shift—undermine cluster homogeneity, leading to slow convergence and degraded model accuracy. To address this, we propose an adaptive online re-clustering framework featuring a lightweight dynamic clustering mechanism based on selective label distribution analysis, which preserves statistical consistency within clusters while optimizing model performance. The framework incorporates defenses against malicious clients and supports scenarios with varying degrees of heterogeneity. Crucially, it operates without access to global data and keeps communication and computational overhead low. Extensive experiments demonstrate that our method improves final model accuracy by 1.9–5.9 percentage points over state-of-the-art baselines and reaches target accuracies 1.16×–2.61× faster.
📝 Abstract
Federated Learning (FL) enables deep learning model training across edge devices and protects user privacy by retaining raw data locally. However, data heterogeneity across client distributions slows model convergence and causes accuracy to plateau at a lower level. Clustered FL addresses this by grouping clients with statistically similar data and training a model per cluster. However, maintaining consistent client similarity within each group becomes challenging when data drifts occur, significantly impacting model accuracy. In this paper, we introduce Fielding, a clustered FL framework that handles data drifts promptly with low overhead. Fielding detects drift on all clients and performs selective label distribution-based re-clustering to balance cluster optimality and model performance, while remaining robust to malicious clients and varying degrees of heterogeneity. Our evaluations show that Fielding improves final model accuracy by 1.9%-5.9% and reaches target accuracies 1.16x-2.61x faster.
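The abstract does not spell out Fielding's re-clustering algorithm, but the core idea of selective label distribution-based re-clustering can be illustrated with a minimal sketch: each client summarizes its data as a normalized label histogram, and only clients whose histogram has drifted beyond a threshold from their current cluster centroid are moved to the nearest centroid. The function names, the L1 distance metric, and the drift threshold below are illustrative assumptions, not Fielding's actual implementation.

```python
from collections import Counter

def label_histogram(labels, num_classes):
    """Normalized label distribution for one client's local data."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(c, 0) / total for c in range(num_classes)]

def l1_distance(p, q):
    """L1 distance between two label distributions (metric is an assumption)."""
    return sum(abs(a - b) for a, b in zip(p, q))

def reassign_drifted_client(client_labels, centroids, num_classes,
                            current_cluster, drift_threshold=0.5):
    """Selective re-clustering sketch: a client is re-assigned only if its
    label distribution has drifted past the threshold from its current
    cluster centroid; otherwise it stays put, avoiding a full re-cluster."""
    hist = label_histogram(client_labels, num_classes)
    if l1_distance(hist, centroids[current_cluster]) <= drift_threshold:
        return current_cluster  # no significant drift: keep assignment
    # drifted: move to the centroid with the closest label distribution
    return min(range(len(centroids)),
               key=lambda k: l1_distance(hist, centroids[k]))
```

For example, with two cluster centroids `[0.9, 0.1]` and `[0.1, 0.9]`, a client in cluster 0 whose labels become mostly class 1 would be re-assigned to cluster 1, while clients whose distributions are stable are left untouched; keeping re-clustering local to drifted clients is what keeps the overhead low.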