Privacy of SGD under Gaussian or Heavy-Tailed Noise: Guarantees without Gradient Clipping

📅 2024-03-04
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses the challenge of establishing rigorous differential privacy (DP) guarantees when injecting heavy-tailed noise—specifically, α-stable noise (including infinite-variance cases)—into stochastic gradient descent (SGD). Prior analyses typically rely on gradient clipping, bounded gradients, or convexity assumptions. We provide the first $(\epsilon,\delta)$-DP guarantee for SGD with α-stable noise under **no gradient clipping**, **no gradient norm boundedness assumption**, and **non-convex loss functions**. Our key theoretical contribution is proving that α-stable noise alone achieves $(0,O(1/n))$-DP, demonstrating that projection or clipping steps are often unnecessary. Furthermore, we unify the privacy–optimization trade-off analysis for both heavy-tailed (e.g., α-stable) and light-tailed (e.g., Gaussian) noise, showing that heavy-tailed noise serves as an effective, theoretically justified alternative to Gaussian noise. These results establish a more general and practical foundation for privacy-preserving optimization in unconstrained settings.

📝 Abstract
The injection of heavy-tailed noise into the iterates of stochastic gradient descent (SGD) has garnered growing interest in recent years due to its theoretical and empirical benefits for optimization and generalization. However, its implications for privacy preservation remain largely unexplored. Aiming to bridge this gap, we provide differential privacy (DP) guarantees for noisy SGD, when the injected noise follows an $\alpha$-stable distribution, which includes a spectrum of heavy-tailed distributions (with infinite variance) as well as the light-tailed Gaussian distribution. Considering the $(\epsilon, \delta)$-DP framework, we show that SGD with heavy-tailed perturbations achieves $(0, O(1/n))$-DP for a broad class of loss functions which can be non-convex, where $n$ is the number of data points. As a remarkable byproduct, contrary to prior work that necessitates bounded sensitivity for the gradients or clipping the iterates, our theory can handle unbounded gradients without clipping, and reveals that under mild assumptions, such a projection step is not actually necessary. Our results suggest that, given other benefits of heavy-tails in optimization, heavy-tailed noising schemes can be a viable alternative to their light-tailed counterparts.
Problem

Research questions and friction points this paper is trying to address.

Explores privacy implications of heavy-tailed noise in SGD
Provides differential privacy guarantees for α-stable noise distributions
Handles unbounded gradients without requiring clipping
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proves that α-stable noise alone yields a $(0, O(1/n))$-DP guarantee for SGD with non-convex losses
Unifies the privacy–optimization trade-off analysis for heavy-tailed (α-stable) and light-tailed (Gaussian) noise
Shows that, under mild assumptions, clipping or projection steps are unnecessary for the privacy guarantee
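The paper's core mechanism, SGD with symmetric α-stable noise injected into each iterate and no gradient clipping, can be sketched as follows. This is an illustrative sketch, not the authors' code: the function name `noisy_sgd_step` and the hyperparameter values are assumptions, and the exact noise scaling used in the paper's analysis may differ.

```python
import numpy as np
from scipy.stats import levy_stable


def noisy_sgd_step(theta, grad, lr=0.01, alpha=1.8, scale=0.1, rng=None):
    """One SGD iterate perturbed by symmetric alpha-stable noise.

    alpha = 2.0 recovers the light-tailed Gaussian case; alpha < 2 gives
    heavy tails with infinite variance. Note there is no gradient clipping
    or projection step, matching the unconstrained setting of the paper.
    """
    # Draw symmetric (beta=0) alpha-stable noise, one component per parameter.
    noise = levy_stable.rvs(alpha, beta=0.0, scale=scale,
                            size=theta.shape, random_state=rng)
    return theta - lr * (grad + noise)


# Hypothetical usage on a toy quadratic loss f(theta) = ||theta||^2 / 2,
# whose gradient is simply theta.
rng = np.random.default_rng(0)
theta = np.ones(5)
for _ in range(100):
    theta = noisy_sgd_step(theta, grad=theta, rng=rng)
```

With heavy-tailed noise (α < 2) the individual perturbations occasionally take very large values, which is precisely the regime the paper shows still admits a $(0, O(1/n))$-DP guarantee without any clipping.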
Umut Simsekli
INRIA - École Normale Supérieure
Deep Learning Theory · Langevin Monte Carlo
M. Gürbüzbalaban
Department of Management Science and Information Systems, Rutgers Business School, Piscataway, NJ, USA
S. Yıldırım
Faculty of Engineering and Natural Sciences, Sabancı University, Istanbul, Turkey.
Lingjiong Zhu
Florida State University