AI Summary
This paper studies differentially private (DP) nonconvex-strongly-concave minimax optimization, motivated by applications including deep AUC maximization, generative adversarial networks (GANs), and temporal difference learning. It proposes a variance-reduced private stochastic gradient descent-ascent (SGDA) algorithm that incorporates gradient clipping and Gaussian noise injection. The method achieves a gradient-norm bound of $\tilde{O}(d^{1/3}/(n\epsilon)^{2/3})$, matching the best-known rate for nonconvex empirical risk minimization under DP constraints. The paper also discusses several lower bounds for private minimax optimization and analyzes the convergence and privacy-utility trade-off of the method. Experiments on AUC maximization, GANs, and temporal difference learning support the theoretical analysis.
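The clipping and Gaussian noise injection mentioned above can be illustrated with a minimal sketch of one noisy descent-ascent step. This is not the paper's algorithm: it omits variance reduction, and the noise multiplier `sigma` is a free parameter here rather than a value calibrated to a privacy budget $(\epsilon, \delta)$. The function names `clip`, `private_mean`, and `noisy_sgda_step` are hypothetical, and the per-sample gradients are assumed to be supplied by the caller.

```python
import math
import random


def clip(grad, max_norm):
    """Rescale a gradient vector so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean(per_sample_grads, max_norm, sigma, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        for i, g in enumerate(clip(grad, max_norm)):
            total[i] += g
    # Assumption: noise standard deviation is sigma times the clipping
    # threshold; sigma is NOT calibrated to any (epsilon, delta) here.
    return [(t + rng.gauss(0.0, sigma * max_norm)) / n for t in total]


def noisy_sgda_step(x, y, grads_x, grads_y, lr_x, lr_y, max_norm, sigma, rng):
    """One simultaneous step: noisy descent on x, noisy ascent on y.

    grads_x and grads_y are lists of per-sample gradients with respect to
    x and y, both evaluated at the current point (x, y).
    """
    gx = private_mean(grads_x, max_norm, sigma, rng)
    gy = private_mean(grads_y, max_norm, sigma, rng)
    new_x = [xi - lr_x * g for xi, g in zip(x, gx)]
    new_y = [yi + lr_y * g for yi, g in zip(y, gy)]
    return new_x, new_y
```

Clipping bounds how much any single sample can change the summed gradient, which is what allows noise of a fixed scale to mask that sample's contribution. Choosing `sigma` to meet a given privacy budget requires a privacy accounting step that this sketch leaves out.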
Abstract
In this paper, we study (finite-sum) minimax optimization in the differential privacy (DP) model. Unlike most previous studies, which consider (strongly) convex-concave settings or loss functions satisfying the Polyak-Łojasiewicz condition, we focus on the nonconvex-strongly-concave setting, which covers many models in deep learning, such as deep AUC maximization. Specifically, we first analyze a DP version of Stochastic Gradient Descent Ascent (SGDA) and show that it yields a DP estimator for which the $\ell_2$-norm of the gradient of the empirical risk function is upper bounded by $\tilde{O}\left(\frac{d^{1/4}}{(n\epsilon)^{1/2}}\right)$, where $d$ is the model dimension and $n$ is the sample size. We then propose a new method with lower gradient noise variance and improve the upper bound to $\tilde{O}\left(\frac{d^{1/3}}{(n\epsilon)^{2/3}}\right)$, which matches the best-known result for DP Empirical Risk Minimization with nonconvex loss. We also discuss several lower bounds for private minimax optimization. Finally, experiments on AUC maximization, generative adversarial networks, and temporal difference learning with real-world data support our theoretical analysis.
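To see when the second bound improves on the first, one can compare the two rates directly. The following is a rough calculation read off from the two stated bounds, ignoring the constants and logarithmic factors hidden by $\tilde{O}$; it is not a statement taken from the paper.

$$
\frac{d^{1/3}/(n\epsilon)^{2/3}}{d^{1/4}/(n\epsilon)^{1/2}}
= \frac{d^{1/12}}{(n\epsilon)^{1/6}}
= \left(\frac{d}{(n\epsilon)^{2}}\right)^{1/12}
$$

The ratio is below 1 exactly when $d < (n\epsilon)^2$, which is also the regime in which the first bound $d^{1/4}/(n\epsilon)^{1/2}$ is itself below 1.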