🤖 AI Summary
To address privacy leakage during hyperparameter tuning on sensitive data in distributed machine learning, this paper proposes a client-level differentially private hyperparameter search method. The approach introduces a distributed voting mechanism based on local evaluations, modeling hyperparameter selection as a noisy consensus process; its privacy guarantee is independent of the number of hyperparameters, thereby substantially alleviating the utility–privacy trade-off. By modularly integrating differential privacy noise injection and local evaluation within the Flower framework, the method natively supports both IID and non-IID data distributions. Experimental results demonstrate that, under stringent privacy budgets (ε ≤ 2), the method efficiently converges to high-quality hyperparameter configurations endorsed by the majority of clients, achieving model performance close to the non-private baseline. The method is scalable, generalizable across diverse federated learning settings, and practically deployable.
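The summary describes hyperparameter selection as a noisy consensus over client votes whose privacy guarantee does not depend on the number of candidate configurations. The paper's exact mechanism is not given here, but a classic way to obtain such a guarantee is "report noisy max": each client votes for its locally best configuration, and the server adds Laplace noise to the vote counts before taking the argmax. The sketch below is a hypothetical illustration under that assumption (the function name, score layout, and noise calibration are ours, not the paper's):

```python
import numpy as np

def noisy_vote_select(local_scores, epsilon, rng=None):
    """Illustrative noisy-voting hyperparameter selection (not DP-HYPE itself).

    local_scores: array of shape (n_clients, n_configs) with each client's
    local evaluation score for every candidate hyperparameter configuration.
    Each client casts one vote for its best configuration; the server
    perturbs the vote histogram and returns the index of the noisy winner.
    """
    rng = np.random.default_rng(rng)
    local_scores = np.asarray(local_scores, dtype=float)
    n_clients, n_configs = local_scores.shape

    # One vote per client, based only on its local evaluation.
    votes = np.bincount(local_scores.argmax(axis=1), minlength=n_configs)

    # Report noisy max: adding or removing one client changes each count by
    # at most 1, so Laplace(1/epsilon) noise on the counts yields an
    # epsilon-DP selection whose guarantee does not grow with n_configs —
    # consistent with the independence property claimed in the summary.
    noisy_counts = votes + rng.laplace(scale=1.0 / epsilon, size=n_configs)
    return int(noisy_counts.argmax())
```

Because only the index of the winner is released (never the noisy counts themselves), the privacy cost is paid once per selection rather than once per candidate, which is what makes the guarantee independent of the size of the hyperparameter grid.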
📝 Abstract
The tuning of hyperparameters in distributed machine learning can substantially impact model performance. When the hyperparameters are tuned on sensitive data, privacy becomes an important challenge, and differential privacy has emerged as the de facto standard for provable privacy. A standard setting in distributed learning tasks is that clients agree on a shared setup, i.e., find a compromise from a set of hyperparameters, such as the learning rate of the model to be trained. Yet, prior work on differentially private hyperparameter tuning either uses computationally expensive cryptographic protocols, determines hyperparameters separately for each client, or applies differential privacy locally, which can lead to undesirable utility-privacy trade-offs.
In this work, we present our algorithm DP-HYPE, which performs a distributed and privacy-preserving hyperparameter search by conducting a distributed vote based on the clients' local hyperparameter evaluations. In this way, DP-HYPE selects hyperparameters that lead to a compromise supported by the majority of clients, while maintaining scalability and independence from specific learning tasks. We prove that DP-HYPE preserves the strong notion of client-level differential privacy and, importantly, show that its privacy guarantees do not depend on the number of hyperparameters. We also provide bounds on its utility guarantees, that is, the probability of reaching a compromise, and implement DP-HYPE as a submodule in the popular Flower framework for distributed machine learning. In addition, we evaluate its performance on multiple benchmark data sets in an iid setting as well as multiple non-iid settings, and demonstrate high utility of DP-HYPE even under small privacy budgets.