🤖 AI Summary
In heterogeneous federated learning (FL), performance degradation arises from statistical and computational heterogeneity across clients, while conventional clustering approaches risk privacy leakage through client identification or data similarity inference. To address these challenges, this paper proposes the Anonymous Adaptive Clustering (AAC) framework. AAC introduces a novel client anonymization mechanism based on oblivious shuffle, integrated with probability-driven adaptive clustering and differential privacy–enhanced grouping to jointly conceal both client identities and data similarity. Furthermore, it incorporates an iterative adaptive frequency decay mechanism to dynamically refine cluster structures and improve collaborative training efficiency. Extensive experiments demonstrate that AAC achieves strong privacy guarantees—satisfying rigorous differential privacy—while accelerating model convergence by approximately 7× compared to state-of-the-art heterogeneous FL methods.
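The shuffle-based anonymization idea above can be illustrated with a minimal sketch. The paper's actual oblivious shuffle construction is not reproduced here; the function names and the use of a simple seeded permutation are assumptions standing in for the cryptographic shuffler. The point shown is that averaging is permutation-invariant, so shuffling client updates before aggregation hides which client sent which update without changing the aggregate.

```python
import random
from typing import List, Optional

def shuffle_then_aggregate(client_updates: List[List[float]],
                           seed: Optional[int] = None) -> List[float]:
    """Permute updates (standing in for an oblivious shuffle), then average.

    Hypothetical sketch: in the paper's setting the shuffle is oblivious to
    the aggregation server, so the server cannot link an update (or the
    cluster it implies) back to a client identity.
    """
    shuffled = list(client_updates)
    random.Random(seed).shuffle(shuffled)  # placeholder for the oblivious shuffle
    n = len(shuffled)
    dim = len(shuffled[0])
    # The mean is invariant under permutation, so anonymization is "free"
    # with respect to the aggregated model update.
    return [sum(u[i] for u in shuffled) / n for i in range(dim)]

updates = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(shuffle_then_aggregate(updates))  # [3.0, 4.0] regardless of the permutation
```

Because the server only ever sees the permuted batch, per-client attribution is broken while the training signal is preserved.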
📝 Abstract
Federated learning (FL) is a distributed machine learning paradigm that enables multiple clients to train a model collaboratively without exposing their local data. Among FL schemes, clustering is an effective technique for addressing the heterogeneity issue (i.e., differences in data distribution and computational ability that affect training performance and effectiveness) by grouping participants with similar computational resources or data distributions into clusters. However, intra-cluster data exchange poses privacy risks, while cluster selection and adaptation introduce challenges that may affect overall performance. To address these challenges, this paper introduces anonymous adaptive clustering, a novel approach that simultaneously strengthens privacy protection and boosts training efficiency. Specifically, an oblivious shuffle-based anonymization method is designed to safeguard user identities and prevent the aggregation server from inferring similarities through clustering. Additionally, to improve performance, we introduce an iteration-based adaptive frequency decay strategy, which leverages variability in clustering probabilities to optimize training dynamics. Building on these techniques, we construct FedCAPrivacy; experiments show that FedCAPrivacy achieves a ~7X performance improvement while maintaining high privacy.
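The iteration-based adaptive frequency decay strategy can be sketched as follows. The exact schedule used by FedCAPrivacy is not given in the abstract, so the exponential form and the `decay` parameter below are assumptions; the sketch only conveys the idea that re-clustering runs often in early rounds, when cluster structure is still uncertain, and progressively less often as training stabilizes.

```python
import math
import random

def recluster_probability(t: int, decay: float = 0.1) -> float:
    """Probability of running the clustering step at round t.

    Hypothetical exponential-decay schedule: starts at 1.0 (always
    recluster at round 0) and decays toward 0 as training proceeds.
    """
    return math.exp(-decay * t)

def should_recluster(t: int, rng: random.Random, decay: float = 0.1) -> bool:
    # Draw uniformly in [0, 1); recluster when the draw falls below the
    # decaying probability, so early rounds almost always recluster.
    return rng.random() < recluster_probability(t, decay)

print(round(recluster_probability(0), 3))   # 1.0
print(round(recluster_probability(30), 3))  # 0.05
```

Front-loading the clustering work this way is one plausible reading of how the framework refines cluster structures early while avoiding redundant re-clustering cost in later rounds.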