🤖 AI Summary
Federated clustering suffers severe utility degradation under client-level differential privacy (DP), which limits its effectiveness for personalized model training. To address this, we propose RR-Cluster, a method that randomly rebalances cluster assignments. By enforcing a minimum client count per cluster while preserving client-level DP guarantees, RR-Cluster reduces the variance of privacy noise, trading it off against the bias that incorrect assignments may introduce. We provide a theoretical convergence analysis and show that RR-Cluster can be plugged into mainstream federated clustering frameworks. Experiments on both synthetic and real-world datasets consistently show improved privacy/utility tradeoffs. RR-Cluster improves the performance of several strong baseline clustering algorithms, including FedEM, IFCA, and FCL, under stringent DP constraints.
📝 Abstract
Federated clustering aims to group similar clients into clusters and produce one model for each cluster. Such a personalization approach typically improves model performance compared with training a single model to serve all clients, but can be more vulnerable to privacy leakage. Directly applying client-level differentially private (DP) mechanisms to federated clustering can degrade utility significantly. We identify that this deficiency is mainly due to the difficulty of averaging out privacy noise within each cluster (following standard privacy mechanisms), as the number of clients assigned to each cluster is uncontrolled. To this end, we propose a simple and effective technique, named RR-Cluster, that can be viewed as a lightweight add-on to many federated clustering algorithms. RR-Cluster reduces privacy noise by randomly rebalancing cluster assignments, guaranteeing a minimum number of clients assigned to each cluster. We analyze the tradeoff between decreased privacy noise variance and potentially increased bias from incorrect assignments, and provide convergence bounds for RR-Cluster. Empirically, we demonstrate that RR-Cluster, plugged into strong federated clustering algorithms, yields significantly improved privacy/utility tradeoffs across both synthetic and real-world datasets.
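
The rebalancing idea described above can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names `rebalance` and `noise_std_of_mean`, the rule for choosing which clients to move, and the assumption that Gaussian noise of standard deviation `sigma` is added once to the sum of a cluster's updates before averaging are all hypothetical choices made for illustration.

```python
import random
from collections import defaultdict


def rebalance(assignments, num_clusters, min_size, rng):
    """Randomly move clients out of larger clusters until every cluster
    holds at least `min_size` clients (illustrative rule, not the paper's)."""
    if num_clusters * min_size > len(assignments):
        raise ValueError("too few clients to give every cluster min_size members")

    # Group client indices by their current cluster.
    members = defaultdict(list)
    for client, cluster in enumerate(assignments):
        members[cluster].append(client)

    new_assignments = list(assignments)
    for k in range(num_clusters):
        while len(members[k]) < min_size:
            # Donors are clusters that can give up a client and still
            # keep at least min_size members; one always exists here.
            donors = [c for c in range(num_clusters) if len(members[c]) > min_size]
            donor = rng.choice(donors)
            client = members[donor].pop(rng.randrange(len(members[donor])))
            members[k].append(client)
            new_assignments[client] = k  # possibly an "incorrect" assignment (bias)
    return new_assignments


def noise_std_of_mean(sigma, cluster_size):
    """Per-coordinate noise std of a cluster average, assuming noise with
    std `sigma` is added once to the sum of the cluster's updates."""
    return sigma / cluster_size


# Nine clients pick cluster 0, one picks cluster 1, none pick cluster 2.
before = [0] * 9 + [1]
after = rebalance(before, num_clusters=3, min_size=3, rng=random.Random(0))
print([after.count(k) for k in range(3)])  # cluster sizes: [4, 3, 3]
```

Under these assumptions, the cluster that originally held a single client would see its averaged noise standard deviation drop from `sigma` to `sigma / 3`, while the moved clients now contribute to a cluster they did not choose. That is the variance/bias tradeoff the abstract refers to.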