🤖 AI Summary
This work addresses a critical limitation of existing differentially private federated learning frameworks: they typically assume a uniform privacy budget across all clients, an assumption that fails to capture the heterogeneous privacy requirements of real-world settings and undermines conventional data-volume-based client selection. To tackle this challenge, the paper presents the first systematic study of client selection under privacy heterogeneity and introduces a privacy-aware weighted sampling strategy. By formulating a convex optimization problem, the method dynamically adjusts each client's selection probability to minimize training error while explicitly accounting for individual privacy budgets. Theoretical analysis establishes the convergence of the proposed algorithm, and empirical evaluations on benchmark datasets such as CIFAR-10 demonstrate up to a 10% improvement in test accuracy over state-of-the-art approaches.
📝 Abstract
Differentially private federated learning (DP-FL) enables clients to collaboratively train machine learning models while preserving the privacy of their local data. However, most existing DP-FL approaches assume that all clients share a uniform privacy budget, an assumption that does not hold in real-world scenarios where privacy requirements vary widely. This privacy heterogeneity poses a significant challenge: conventional client selection strategies, which typically rely on data quantity, cannot distinguish between clients providing high-quality updates and those introducing substantial noise due to strict privacy constraints. To address this gap, we present the first systematic study of privacy-aware client selection in DP-FL. We establish a theoretical foundation by deriving a convergence analysis that quantifies the impact of privacy heterogeneity on training error. Building on this analysis, we propose a privacy-aware client selection strategy, formulated as a convex optimization problem, that adaptively adjusts selection probabilities to minimize training error. Extensive experiments on benchmark datasets demonstrate that our approach achieves up to a 10% improvement in test accuracy on CIFAR-10 compared to existing baselines under heterogeneous privacy budgets. These results highlight the importance of incorporating privacy heterogeneity into client selection for practical and effective federated learning.
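To make the core idea concrete, here is a minimal Python sketch of privacy-aware weighted client sampling. This is an illustration, not the paper's actual algorithm: the scoring rule (favoring clients with more data and looser budgets, since Gaussian-mechanism noise variance scales roughly as 1/ε²), the client metadata, and the inverse-probability weighting are all assumptions for demonstration; the paper instead derives the probabilities from a convex optimization problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-client metadata: local dataset sizes and DP budgets.
data_sizes = np.array([500, 1200, 300, 800, 1000], dtype=float)
epsilons = np.array([8.0, 0.5, 4.0, 1.0, 2.0])  # smaller eps => noisier updates

# Illustrative score: reward both data volume and a loose privacy budget,
# because Gaussian-mechanism noise variance grows roughly as 1/eps^2.
scores = data_sizes * epsilons**2
probs = scores / scores.sum()  # selection probabilities, sum to 1


def sample_clients(probs: np.ndarray, m: int, rng: np.random.Generator) -> np.ndarray:
    """Draw m distinct client indices with probability proportional to `probs`."""
    return rng.choice(len(probs), size=m, replace=False, p=probs)


selected = sample_clients(probs, m=2, rng=rng)
# Inverse-probability weights keep the aggregated model update unbiased
# in expectation, a standard trick in importance-sampled FL aggregation.
weights = 1.0 / (probs[selected] * len(selected))
```

Under this scoring, the client with the strictest budget (ε = 0.5) receives the lowest selection probability despite holding the most data, which is exactly the regime where data-volume-based selection fails.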