🤖 AI Summary
To address the high communication overhead and the challenge of client grouping under non-IID data in federated learning, this paper proposes a dynamic, self-adaptive client clustering and selection mechanism. Departing from conventional fixed-cluster-number assumptions, our method introduces, for the first time, a K-means–based adaptive clustering framework that jointly leverages client similarity metrics and a data-driven cluster-number evolution algorithm to optimize the number of clusters online. This enables automatic selection of the most representative clients while preserving model accuracy, thereby substantially reducing communication load. Experiments on a Non-IID handwritten digit recognition benchmark demonstrate nearly 50% reduction in communication cost, with test accuracy matching that of full-client participation—achieving a Pareto-optimal trade-off between communication efficiency and model performance.
📝 Abstract
Federated learning is a novel decentralized learning architecture. During training, clients and the server must repeatedly upload and receive model parameters, which consumes substantial network transmission resources. Some methods use clustering to identify more representative clients and select only a subset of them for training while still preserving accuracy. However, in federated learning it is not obvious which number of clusters yields the best training result. We therefore propose to dynamically adjust the number of clusters to find the most suitable grouping, reducing the number of clients participating in each round and thereby lowering communication cost without degrading model performance. We validate the approach on a non-IID handwritten digit recognition dataset and reduce communication cost by almost 50% compared with traditional federated learning, without affecting model accuracy.
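The core idea — choosing the number of client clusters data-drivenly, then sampling one representative client per cluster — can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it assumes clients are compared via their flattened model updates and uses the silhouette score as the cluster-number criterion; the function name and parameters are hypothetical.

```python
# Hypothetical sketch of adaptive client clustering for federated learning.
# Assumption: each client is represented by its flattened model-update vector,
# and the silhouette score stands in for the paper's cluster-number criterion.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def select_representative_clients(updates, k_max=10, seed=0):
    """updates: (n_clients, d) array of flattened client model updates.
    Returns (best_k, indices of one representative client per cluster)."""
    X = np.asarray(updates, dtype=float)
    best_k, best_score, best_model = 2, -1.0, None
    # Search over candidate cluster counts; keep the best-scoring clustering.
    for k in range(2, min(k_max, len(X) - 1) + 1):
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        score = silhouette_score(X, km.labels_)
        if score > best_score:
            best_k, best_score, best_model = k, score, km
    # Pick the client closest to each centroid as that cluster's representative.
    reps = []
    for c in range(best_k):
        members = np.where(best_model.labels_ == c)[0]
        dists = np.linalg.norm(X[members] - best_model.cluster_centers_[c], axis=1)
        reps.append(int(members[np.argmin(dists)]))
    return best_k, sorted(reps)
```

In a federated round, only the returned representative clients would train and upload parameters, which is how the communication reduction arises: the server communicates with `best_k` clients instead of all of them.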