🤖 AI Summary
This work addresses the challenging setting of differentially private stochastic convex optimization (DP-SCO) in which the population risk satisfies the Tsybakov noise condition (TNC) with exponent $\theta > 1$ while the loss need not be Lipschitz continuous, so gradients may be unbounded and only a finite $k$-th moment bound ($k \geq 2$) is available. We propose the first DP-SGD variant tailored to TNC, integrating gradient clipping, Gaussian perturbation, and stability analysis under moment constraints. Our analysis unifies both the zero-concentrated DP (zCDP) and $(\varepsilon, \delta)$-DP frameworks. Crucially, we derive the first privacy-utility bound independent of any Lipschitz constant in this non-Lipschitz, high-noise-decay regime, achieving the high-probability rate $\tilde{O}\left(\left(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{k-1}{k}\cdot\frac{\theta}{\theta-1}}\right)$. This rate strictly improves upon classical Lipschitz-based bounds and is shown to be tight.
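To make the algorithmic ingredients named above concrete, here is a minimal, hypothetical sketch of one projected DP-SGD update with per-sample gradient clipping and Gaussian perturbation. This is not the paper's algorithm; the function `dp_sgd_step`, the clipping threshold, the noise calibration, and the projection radius are illustrative assumptions only.

```python
# Hypothetical sketch (not the authors' algorithm): one projected DP-SGD update
# with per-sample gradient clipping and Gaussian perturbation.
import numpy as np

def dp_sgd_step(w, grads, clip_C, sigma, lr, radius, rng):
    """One clipped-and-noised SGD update on a minibatch of per-sample gradients.

    - clips each per-sample gradient to l2-norm at most clip_C,
    - averages and adds isotropic Gaussian noise with std sigma * clip_C / batch,
    - takes a gradient step and projects back onto the l2 ball of given radius
      (a bounded domain, as is standard in DP-SCO).
    """
    batch = grads.shape[0]
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    clipped = grads * np.minimum(1.0, clip_C / np.maximum(norms, 1e-12))
    noisy_mean = clipped.mean(axis=0) + rng.normal(
        scale=sigma * clip_C / batch, size=w.shape)
    w_new = w - lr * noisy_mean
    nrm = np.linalg.norm(w_new)
    return w_new if nrm <= radius else w_new * (radius / nrm)

# Toy usage on synthetic least-squares data (parameters are arbitrary).
rng = np.random.default_rng(0)
d, n = 5, 256
X, y = rng.normal(size=(n, d)), rng.normal(size=n)
w = np.zeros(d)
for _ in range(50):
    idx = rng.choice(n, size=32, replace=False)
    per_sample = (X[idx] @ w - y[idx])[:, None] * X[idx]  # per-sample gradients
    w = dp_sgd_step(w, per_sample, clip_C=1.0, sigma=1.0,
                    lr=0.1, radius=10.0, rng=rng)
```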
📝 Abstract
We study Stochastic Convex Optimization in the Differential Privacy model (DP-SCO). Unlike previous studies, here we assume the population risk function satisfies the Tsybakov Noise Condition (TNC) with some parameter $\theta>1$, where the Lipschitz constant of the loss could be extremely large or even unbounded, but the $\ell_2$-norm of the gradient of the loss has a bounded $k$-th moment with $k\geq 2$. For the Lipschitz case with $\theta\geq 2$, we first propose an $(\varepsilon, \delta)$-DP algorithm whose utility bound is $\tilde{O}\left(\left(\tilde{r}_{2k}\left(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{k-1}{k}}\right)^{\frac{\theta}{\theta-1}}\right)$ in high probability, where $n$ is the sample size, $d$ is the model dimension, and $\tilde{r}_{2k}$ is a term that depends only on the $2k$-th moment of the gradient. Notably, this upper bound is independent of the Lipschitz constant. We then extend our result to the case where $\theta\geq \bar{\theta}> 1$ for some known constant $\bar{\theta}$. Moreover, when the privacy budget $\varepsilon$ is small enough, we show an upper bound of $\tilde{O}\left(\left(\tilde{r}_{k}\left(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{k-1}{k}}\right)^{\frac{\theta}{\theta-1}}\right)$ even if the loss function is not Lipschitz. For the lower bound, we show that for any $\theta\geq 2$, the private minimax rate for $\rho$-zero Concentrated Differential Privacy ($\rho$-zCDP) is lower bounded by $\Omega\left(\left(\tilde{r}_{k}\left(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\sqrt{\rho}}\right)^{\frac{k-1}{k}}\right)^{\frac{\theta}{\theta-1}}\right)$.
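As a rough sanity check (our illustration, not a claim made in the abstract): for $\theta = 2$ the exponent $\frac{\theta}{\theta-1}$ equals $2$, and as $k \to \infty$ (all gradient moments bounded) the exponent $\frac{k-1}{k}$ tends to $1$, so, ignoring the moment term $\tilde{r}_{k}$ and logarithmic factors, the stated rate reduces to
$$\left(\frac{1}{\sqrt{n}} + \frac{\sqrt{d}}{n\varepsilon}\right)^{2} \;\asymp\; \frac{1}{n} + \frac{d}{n^{2}\varepsilon^{2}},$$
which matches the familiar rate for strongly convex, Lipschitz DP-SCO.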