Enhancing DPSGD via Per-Sample Momentum and Low-Pass Filtering

📅 2025-11-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Differential privacy stochastic gradient descent (DPSGD) suffers from dual utility degradation: privacy-preserving noise injection and gradient clipping-induced bias—often exhibiting a trade-off between the two. This paper proposes MomFilter, the first method to jointly mitigate both effects at zero additional privacy cost by integrating per-sample momentum with a post-processing low-pass filter that consumes no privacy budget. Theoretically, MomFilter achieves a faster convergence rate than standard DPSGD under differential privacy constraints. Empirically, it consistently improves test accuracy by 2.1–4.7% across benchmarks including CIFAR-10, SVHN, and ImageNet. Under identical privacy budgets (ε ≤ 8), MomFilter outperforms state-of-the-art DPSGD variants across all evaluated settings, establishing a superior privacy–utility trade-off.

Technology Category

Application Category

📝 Abstract
Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to train deep neural networks with formal privacy guarantees. However, the addition of differential privacy (DP) often degrades model accuracy by introducing both noise and bias. Existing techniques typically address only one of these issues, as reducing DP noise can exacerbate clipping bias and vice-versa. In this paper, we propose a novel method, emph{DP-PMLF}, which integrates per-sample momentum with a low-pass filtering strategy to simultaneously mitigate DP noise and clipping bias. Our approach uses per-sample momentum to smooth gradient estimates prior to clipping, thereby reducing sampling variance. It further employs a post-processing low-pass filter to attenuate high-frequency DP noise without consuming additional privacy budget. We provide a theoretical analysis demonstrating an improved convergence rate under rigorous DP guarantees, and our empirical evaluations reveal that DP-PMLF significantly enhances the privacy-utility trade-off compared to several state-of-the-art DPSGD variants.
Problem

Research questions and friction points this paper is trying to address.

Mitigating DP noise and clipping bias simultaneously in DPSGD training
Improving convergence rate under rigorous differential privacy guarantees
Enhancing privacy-utility trade-off for deep neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Per-sample momentum reduces gradient variance
Low-pass filtering attenuates high-frequency DP noise
Simultaneously mitigates DP noise and clipping bias
🔎 Similar Papers
No similar papers found.
X
Xincheng Xu
School of Computing, Australian National University
T
Thilina Ranbaduge
Data 61, CSIRO
Q
Qing Wang
School of Computing, Australian National University
Thierry Rakotoarivelo
Thierry Rakotoarivelo
Principal Research Scientist, Data61, CSIRO
Data PrivacyPrivacy Risk AssessmentsMachine LearningPrivacy-Enhancing Technologies
D
David Smith
Data 61, CSIRO