🤖 AI Summary
This work studies differentially private (DP) stochastic convex-concave saddle-point problems under ℓ₁-norm constraints, aiming for nearly dimension-independent convergence in the expected duality gap. The authors propose the first (ε,δ)-DP algorithms for this setting, based on stochastic mirror descent and combining an extended Maurey sparsification lemma with a bias-reduced gradient estimator. For general convex-concave (non-bilinear), first-order-smooth objectives, the method achieves a rate of O(√(log d / n) + (log^{3/2} d / (nε))^{1/3}); under an additional second-order-smoothness assumption, it attains O(√(log d / n) + log d / √(nε)) with high probability, nearly matching known lower bounds. This is the first result extending near-dimension-independence beyond bilinear saddle-point problems. Moreover, the authors derive the first ℓ₁-constrained DP stochastic convex optimization algorithm with acceleration that does not rely on Frank–Wolfe steps, achieving near-optimal excess risk.
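The workhorse behind the ℓ₁ geometry in this line of work is stochastic mirror descent with the entropy mirror map (exponentiated gradient) on the simplex, whose regret scales with √(log d) rather than √d. The sketch below is illustrative only, not the paper's private algorithm: `grad` stands in for an assumed (possibly stochastic, or noise-privatized) gradient oracle, and no DP noise is added here.

```python
import numpy as np

def entropic_mirror_descent(grad, d, T, eta):
    """Mirror descent on the probability simplex with the entropy
    mirror map (exponentiated gradient) -- an illustrative sketch of
    the l1-geometry first-order method the paper builds on.

    `grad` is an assumed gradient oracle mapping a point on the
    simplex to a gradient vector; in the DP setting it would be a
    privatized stochastic gradient.
    """
    x = np.full(d, 1.0 / d)          # uniform initialization
    avg = np.zeros(d)
    for _ in range(T):
        g = grad(x)
        x = x * np.exp(-eta * g)     # multiplicative (entropic) update
        x /= x.sum()                 # Bregman projection onto the simplex
        avg += x
    return avg / T                   # averaged iterate
```

For a linear objective f(x) = ⟨c, x⟩ on the simplex, the averaged iterate concentrates on the coordinate with the smallest cost, illustrating the multiplicative-weights behavior of the entropy geometry.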
📝 Abstract
We study the problem of differentially-private (DP) stochastic (convex-concave) saddle-points in the $\ell_1$ setting. We propose $(\varepsilon, \delta)$-DP algorithms based on stochastic mirror descent that attain nearly dimension-independent convergence rates for the expected duality gap, a type of guarantee that was known before only for bilinear objectives. For convex-concave and first-order-smooth stochastic objectives, our algorithms attain a rate of $\sqrt{\log(d)/n} + (\log(d)^{3/2}/[n\varepsilon])^{1/3}$, where $d$ is the dimension of the problem and $n$ the dataset size. Under an additional second-order-smoothness assumption, we show that the duality gap is bounded by $\sqrt{\log(d)/n} + \log(d)/\sqrt{n\varepsilon}$ with high probability, by using bias-reduced gradient estimators. This rate provides evidence of the near-optimality of our approach, since a lower bound of $\sqrt{\log(d)/n} + \log(d)^{3/4}/\sqrt{n\varepsilon}$ exists. Finally, we show that combining our methods with acceleration techniques from online learning leads to the first algorithm for DP Stochastic Convex Optimization in the $\ell_1$ setting that is not based on Frank--Wolfe methods. For convex and first-order-smooth stochastic objectives, our algorithms attain an excess risk of $\sqrt{\log(d)/n} + \log(d)^{7/10}/[n\varepsilon]^{2/5}$, and when additionally assuming second-order-smoothness, we improve the rate to $\sqrt{\log(d)/n} + \log(d)/\sqrt{n\varepsilon}$. Instrumental to all of these results are various extensions of the classical Maurey Sparsification Lemma \cite{Pisier:1980}, which may be of independent interest.
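The classical Maurey argument the abstract refers to can be illustrated by its standard randomized-rounding form: sample $s$ coordinates of a vector in the $\ell_1$ ball i.i.d. with probability proportional to their magnitudes and average the signed, rescaled basis vectors, yielding an $s$-sparse unbiased estimate whose error decays like $1/\sqrt{s}$. The sketch below shows this basic construction only; the paper's actual extensions of the lemma are more involved. The function name and interface are illustrative, not from the paper.

```python
import numpy as np

def maurey_sparsify(x, s, rng=None):
    """Classical randomized Maurey sparsification (illustrative sketch).

    Given x with ||x||_1 = r, draw s indices i.i.d. with probability
    |x_i| / r and average the signed basis vectors scaled by r.  The
    output is at most s-sparse, unbiased for x, and concentrates
    around x at rate O(r / sqrt(s)).
    """
    rng = np.random.default_rng(rng)
    r = np.abs(x).sum()
    if r == 0:
        return np.zeros_like(x, dtype=float)
    p = np.abs(x) / r                       # sampling distribution
    idx = rng.choice(len(x), size=s, p=p)   # s i.i.d. coordinate draws
    out = np.zeros_like(x, dtype=float)
    np.add.at(out, idx, np.sign(x[idx]) * r / s)  # accumulate signed mass
    return out
```

Because every sampled coordinate keeps the sign of $x_i$, the output has exactly the same $\ell_1$ norm as $x$, which is what makes the construction compatible with $\ell_1$-constrained feasible sets.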