The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD

📅 2024-10-08
🏛️ arXiv.org
📈 Citations: 2
Influential: 1
🤖 AI Summary
This paper addresses the gap between theoretical privacy bounds for DP-SGD—derived under the assumption that the adversary sees every intermediate gradient—and the actual privacy risk in practical deployments where only the final model is released and the intermediate iterates stay hidden. To bridge this gap, the authors propose a heuristic privacy analysis that models training as linear: it needs no access to intermediate gradients and can be computed before training from the noise parameters and the number of steps alone. Experiments on vision and language tasks show that the heuristic accurately predicts the outcomes of mainstream privacy auditing attacks, that releasing only the final iterate substantially reduces empirically observed leakage, and that a large gap—sometimes orders of magnitude—separates composition-based theoretical upper bounds from empirical audit lower bounds. The heuristic is not a rigorous guarantee, but it bounds existing auditing attacks in practice and sets a concrete target for tighter theoretical analyses.
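To make the "predicts audit outcomes" claim concrete, here is a minimal sketch of the linearized intuition, not the paper's actual code or exact heuristic: under a linear model, a canary example inserted in `k` of `T` steps shifts the last iterate's projection onto the canary direction by `k*C`, against Gaussian noise of standard deviation `sqrt(T)*sigma*C`. A threshold membership-inference attack on that projection should then follow the Gaussian trade-off curve with effective parameter `mu = k / (sigma*sqrt(T))`. All numbers below are hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

N = NormalDist()
rng = np.random.default_rng(1)

T, sigma, C, k = 200, 1.0, 1.0, 50       # hypothetical training setup
mu = k / (sigma * math.sqrt(T))          # effective Gaussian-DP parameter

# Monte Carlo: projection of the (linearized) last iterate onto the canary
# direction, with and without the canary in the training data.
std = sigma * C * math.sqrt(T)
scores_out = rng.normal(0.0, std, 100_000)    # canary absent
scores_in = rng.normal(k * C, std, 100_000)   # canary present in k steps

fpr = 0.01
t = np.quantile(scores_out, 1 - fpr)          # threshold at 1% false-positive rate
tpr_empirical = (scores_in > t).mean()
tpr_predicted = N.cdf(mu + N.inv_cdf(fpr))    # Gaussian trade-off curve

print(round(tpr_empirical, 3), round(tpr_predicted, 3))
```

The attack's empirical true-positive rate matches the closed-form Gaussian prediction, which is the sense in which a linearized analysis can forecast audit results before training.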

📝 Abstract
We propose a simple heuristic privacy analysis of noisy clipped stochastic gradient descent (DP-SGD) in the setting where only the last iterate is released and the intermediate iterates remain hidden. Namely, our heuristic assumes a linear structure for the model. We show experimentally that our heuristic is predictive of the outcome of privacy auditing applied to various training procedures. Thus it can be used prior to training as a rough estimate of the final privacy leakage. We also probe the limitations of our heuristic by providing some artificial counterexamples where it underestimates the privacy leakage. The standard composition-based privacy analysis of DP-SGD effectively assumes that the adversary has access to all intermediate iterates, which is often unrealistic. However, this analysis remains the state of the art in practice. While our heuristic does not replace a rigorous privacy analysis, it illustrates the large gap between the best theoretical upper bounds and the privacy auditing lower bounds and sets a target for further work to improve the theoretical privacy analyses. We also empirically support our heuristic and show existing privacy auditing attacks are bounded by our heuristic analysis in both vision and language tasks.
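The linear-structure assumption in the abstract has a simple algebraic consequence that a short sketch can verify numerically (hypothetical dimensions and hyperparameters; this is an illustration, not the paper's code): when gradients do not depend on the current parameters, the DP-SGD recursion telescopes, so releasing the last iterate is equivalent to releasing a single Gaussian mechanism over the summed clipped gradients, with total noise standard deviation `sqrt(T)*sigma*C`.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T, eta, C, sigma = 5, 100, 0.1, 1.0, 2.0   # hypothetical setup

# Fixed per-step gradients, as for a linear loss (independent of theta).
grads = rng.normal(size=(T, d))
norms = np.linalg.norm(grads, axis=1, keepdims=True)
grads = grads / np.maximum(1.0, norms / C)     # clip each gradient to norm C

noise = rng.normal(scale=sigma * C, size=(T, d))

# Step-by-step noisy SGD.
theta = np.zeros(d)
for t in range(T):
    theta = theta - eta * (grads[t] + noise[t])

# Collapsed single-release view: one sum of gradients plus one sum of noise.
theta_direct = -eta * (grads.sum(axis=0) + noise.sum(axis=0))

assert np.allclose(theta, theta_direct)
```

Because the summed noise is distributed as N(0, T·σ²·C²·I), the hidden-iterates release reduces, under this assumption, to one Gaussian mechanism rather than T composed ones—which is what makes a before-training estimate possible.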
Problem

Research questions and friction points this paper is trying to address.

Analyzes privacy leakage in DP-SGD with last iterate release.
Proposes heuristic to estimate privacy leakage before training.
Highlights gap between theoretical bounds and empirical privacy auditing.
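The gap in the last bullet can be illustrated in Gaussian-DP terms under the linearized view (a sketch with hypothetical numbers, not the paper's exact accounting): composition treats the target's k participations as k separate Gaussian releases, composing to `sqrt(k)/sigma`, while the collapsed last-iterate release is a single Gaussian mechanism with sensitivity `k*C` against noise of standard deviation `sqrt(T)*sigma*C`, i.e. `k/(sigma*sqrt(T))`—smaller by a factor of `sqrt(k/T)` when the example appears in only some steps.

```python
import math

def mu_composition(k, sigma):
    # Composition view: k separate Gaussian releases, each with parameter
    # 1/sigma, compose to sqrt(k)/sigma under Gaussian DP.
    return math.sqrt(k) / sigma

def mu_last_iterate(k, T, sigma):
    # Linearized last-iterate view: one Gaussian release with sensitivity
    # k*C against total noise of std sqrt(T)*sigma*C; the C cancels.
    return k / (sigma * math.sqrt(T))

T, sigma = 10_000, 1.0
for k in (10, 100, 1_000):
    print(k, mu_composition(k, sigma), mu_last_iterate(k, T, sigma))
```

With these numbers, an example seen in 100 of 10,000 steps gets `mu = 10` from composition but only `mu = 1` from the collapsed release—the kind of orders-of-magnitude discrepancy the auditing experiments expose.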
Innovation

Methods, ideas, or system contributions that make the work stand out.

Heuristic privacy analysis for DP-SGD
Focuses on last iterate privacy leakage
Empirical validation across vision and language tasks
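To report such a heuristic estimate on the usual (ε, δ) scale, a Gaussian-DP parameter μ can be converted with the standard Dong–Roth–Su formula, δ(ε) = Φ(−ε/μ + μ/2) − e^ε·Φ(−ε/μ − μ/2), inverted by bisection. This is a generic conversion sketch, not code from the paper.

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf

def delta_gdp(eps, mu):
    # delta(eps) achieved by a mu-GDP mechanism (Dong, Roth & Su conversion).
    return Phi(-eps / mu + mu / 2) - math.exp(eps) * Phi(-eps / mu - mu / 2)

def eps_gdp(mu, delta, hi=100.0):
    # Smallest eps with delta(eps) <= delta; delta_gdp is decreasing in eps,
    # so plain bisection suffices.
    lo = 0.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if delta_gdp(mid, mu) > delta:
            lo = mid
        else:
            hi = mid
    return hi

print(round(eps_gdp(mu=1.0, delta=1e-5), 2))
```

A heuristic μ computed before training thus yields a directly comparable ε at any target δ, which is how the gap to composition-based bounds can be stated in conventional units.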