AI Summary
To address the challenge of unsupervised clustering of high-dimensional data in federated learning, this paper proposes cluster-contrastive federated clustering (CCFC), a federated clustering framework built on contrastive representation learning. Methodologically, it introduces a cluster-level contrastive loss to align global cluster structures across clients, and designs a robust local cluster center alignment mechanism coupled with adaptive aggregation to mitigate client heterogeneity and dropouts. The framework preserves data privacy by confining raw data to local devices and exchanging only model parameters and cluster centroids. Evaluated on multiple benchmark datasets, CCFC achieves NMI improvements of up to 0.4155 over the most related baseline and, in some cases, more than doubles the clustering accuracy of the best baseline. Notably, it maintains stable performance even under 30% client dropout, demonstrating strong resilience to client unavailability.
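The cluster-level contrastive loss mentioned above can be illustrated with a generic InfoNCE-style formulation over cluster centroids: each local centroid is pulled toward its matching global centroid (positive pair) and pushed away from the others (negatives). The function name and exact form below are illustrative assumptions, not the paper's precise loss.

```python
import numpy as np

def cluster_contrastive_loss(local_centroids, global_centroids, temperature=0.5):
    """Illustrative InfoNCE-style loss over K cluster centroids.

    Positive pairs are matching (local, global) centroids; all other
    global centroids act as negatives. A sketch of the idea only, not
    the exact loss used in the paper.
    """
    # L2-normalize so dot products become cosine similarities.
    a = local_centroids / np.linalg.norm(local_centroids, axis=1, keepdims=True)
    b = global_centroids / np.linalg.norm(global_centroids, axis=1, keepdims=True)
    logits = a @ b.T / temperature               # (K, K) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal (local centroid k vs. global centroid k).
    return -np.mean(np.diag(log_prob))
```

Minimizing this loss encourages each client's cluster structure to agree with the global one, which is the alignment effect the summary describes.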
Abstract
Federated clustering, an essential extension of centralized clustering to federated scenarios, enables multiple data-holding clients to collaboratively group data while keeping their data local. In centralized scenarios, clustering driven by representation learning has made significant advances in handling high-dimensional complex data. However, the combination of federated clustering and representation learning remains underexplored. To bridge this gap, we first tailor a cluster-contrastive model for learning clustering-friendly representations. We then use this model as the foundation for a new federated clustering method, named cluster-contrastive federated clustering (CCFC). Benefiting from representation learning, the clustering performance of CCFC even doubles that of the best baseline methods in some cases. Compared to the most related baseline, this benefit yields substantial NMI score improvements of up to 0.4155 in the most conspicuous case. Moreover, CCFC also shows superior performance in handling device failures from a practical viewpoint.
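The server-side aggregation sketched in the summary has to cope with the fact that cluster indices are arbitrary per client: client A's "cluster 0" may correspond to client B's "cluster 2". A minimal illustration is to align each client's centroids to a reference ordering before averaging them, weighted by client data size. The brute-force permutation matching and the function below are assumptions for illustration, not the paper's exact alignment mechanism.

```python
import itertools
import numpy as np

def align_and_aggregate(client_centroids, client_sizes):
    """Illustrative server step: align each client's K centroids to a
    reference ordering, then average them weighted by client data size.

    Alignment is done by brute force over permutations (fine for small K);
    a sketch of the idea, not the paper's exact mechanism.
    """
    reference = client_centroids[0]
    k = reference.shape[0]
    aligned = []
    for c in client_centroids:
        # Pick the permutation of this client's centroids closest to the reference.
        best = min(itertools.permutations(range(k)),
                   key=lambda p: np.sum((c[list(p)] - reference) ** 2))
        aligned.append(c[list(best)])
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    # Weighted average of aligned centroids -> new global centroids, shape (K, d).
    return np.tensordot(weights, np.stack(aligned), axes=1)
```

Weighting by client data size mirrors FedAvg-style aggregation; dropping an unavailable client simply removes its term from the weighted average, which hints at why such schemes can tolerate device failures.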