Dynamic Clustering for Personalized Federated Learning on Heterogeneous Edge Devices

📅 2025-08-03

📈 Citations: 0

✨ Influential: 0

career value

235K/year

🤖 AI Summary

To address performance degradation of the global model caused by highly heterogeneous client data in federated learning, this paper proposes Dynamic Clustering Personalized Federated Learning (DC-PFL). Its core contributions are: (1) a lightweight data heterogeneity metric based on model weight divergence; (2) an adaptive dynamic clustering mechanism leveraging training loss evolution trends to enable online client grouping optimization; and (3) a hierarchical aggregation strategy that preserves model personalization while reducing communication overhead. Extensive experiments on multiple heterogeneous benchmark datasets demonstrate that DC-PFL achieves average accuracy improvements of 2.3–5.7% over state-of-the-art baselines, accelerates convergence by 38%, and reduces required communication rounds by 29%. The method thus effectively balances model performance, training efficiency, and communication cost.

Technology Category

Application Category

📝 Abstract

Federated Learning (FL) enables edge devices to collaboratively learn a global model, but it may not perform well when clients have high data heterogeneity. In this paper, we propose a dynamic clustering algorithm for personalized federated learning system (DC-PFL) to address the problem of data heterogeneity. DC-PFL starts with all clients training a global model and gradually groups the clients into smaller clusters for model personalization based on their data similarities. To address the challenge of estimating data heterogeneity without exposing raw data, we introduce a discrepancy metric called model discrepancy, which approximates data heterogeneity solely based on the model weights received by the server. We demonstrate that model discrepancy is strongly and positively correlated with data heterogeneity and can serve as a reliable indicator of data heterogeneity. To determine when and how to change grouping structures, we propose an algorithm based on the rapid decrease period of the training loss curve. Moreover, we propose a layer-wise aggregation mechanism that aggregates the low-discrepancy layers at a lower frequency to reduce the amount of transmitted data and communication costs. We conduct extensive experiments on various datasets to evaluate our proposed algorithm, and our results show that DC-PFL significantly reduces total training time and improves model accuracy compared to baselines.

Problem

Research questions and friction points this paper is trying to address.

Address data heterogeneity in federated learning

Cluster clients dynamically for personalized models

Reduce communication costs with layer-wise aggregation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic clustering for personalized federated learning

Model discrepancy metric for data heterogeneity

Layer-wise aggregation to reduce communication costs

🔎 Similar Papers

No similar papers found.