Enhancing DPSGD via Per-Sample Momentum and Low-Pass Filtering

📅 2025-11-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Differentially private stochastic gradient descent (DPSGD) loses utility in two ways: the noise injected to preserve privacy and the bias induced by gradient clipping. Reducing one often worsens the other. This paper proposes DP-PMLF, described as the first method to mitigate both effects jointly at zero additional privacy cost, by combining per-sample momentum with a post-processing low-pass filter that consumes no privacy budget. Theoretically, DP-PMLF achieves a faster convergence rate than standard DPSGD under differential privacy constraints. Empirically, it consistently improves test accuracy by 2.1–4.7% across benchmarks including CIFAR-10, SVHN, and ImageNet. Under identical privacy budgets (ε ≤ 8), DP-PMLF outperforms state-of-the-art DPSGD variants across all evaluated settings, yielding a better privacy–utility trade-off.

📝 Abstract

Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to train deep neural networks with formal privacy guarantees. However, the addition of differential privacy (DP) often degrades model accuracy by introducing both noise and bias. Existing techniques typically address only one of these issues, as reducing DP noise can exacerbate clipping bias and vice-versa. In this paper, we propose a novel method, DP-PMLF, which integrates per-sample momentum with a low-pass filtering strategy to simultaneously mitigate DP noise and clipping bias. Our approach uses per-sample momentum to smooth gradient estimates prior to clipping, thereby reducing sampling variance. It further employs a post-processing low-pass filter to attenuate high-frequency DP noise without consuming additional privacy budget. We provide a theoretical analysis demonstrating an improved convergence rate under rigorous DP guarantees, and our empirical evaluations reveal that DP-PMLF significantly enhances the privacy-utility trade-off compared to several state-of-the-art DPSGD variants.
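
For orientation, the baseline DPSGD step that the abstract refers to is commonly written as below. This is a sketch of the standard formulation, not notation taken from the paper: the symbols B (batch size), C (clipping threshold), σ (noise multiplier), and η (learning rate) are assumed here for illustration.

```latex
% Standard DPSGD update (assumed notation, not the paper's).
\[
  \operatorname{clip}(g, C) = g \cdot \min\!\left(1, \frac{C}{\lVert g \rVert_2}\right),
\]
\[
  \tilde{g}_t = \frac{1}{B}\left( \sum_{i \in \mathcal{B}_t} \operatorname{clip}(g_{t,i}, C)
                + \mathcal{N}\!\left(0, \sigma^2 C^2 I\right) \right),
  \qquad
  \theta_{t+1} = \theta_t - \eta\, \tilde{g}_t .
\]
```

Clipping is the source of bias, because the clipped average no longer equals the true average gradient. The Gaussian term is the DP noise. Lowering C shrinks the noise but clips more gradients, which is the tension the abstract describes. Any function applied to the already-noised quantity, such as a filter over past values of the noisy gradient, is post-processing and does not change the DP guarantee.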

Problem

Research questions and friction points this paper is trying to address.

Mitigating DP noise and clipping bias simultaneously in DPSGD training

Improving convergence rate under rigorous differential privacy guarantees

Enhancing privacy-utility trade-off for deep neural networks

Innovation

Methods, ideas, or system contributions that make the work stand out.

Per-sample momentum reduces gradient variance

Low-pass filtering attenuates high-frequency DP noise

Simultaneously mitigates DP noise and clipping bias
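
The three ideas listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes the same samples appear in every step so each row can keep its own momentum, it uses a first-order exponential moving average as the low-pass filter, and the names `private_step`, `beta`, `alpha`, `clip_norm`, and `noise_multiplier` are hypothetical. It performs no privacy accounting.

```python
import numpy as np

def clip_rows(v, clip_norm):
    """Scale each row of v so that its L2 norm is at most clip_norm."""
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return v * scale

def private_step(per_sample_grads, momenta, filtered, rng,
                 beta=0.9, alpha=0.5, clip_norm=1.0, noise_multiplier=1.0):
    """One illustrative step: momentum -> clip -> noise -> low-pass filter.

    per_sample_grads, momenta: arrays of shape (batch, dim)
    filtered: array of shape (dim,), the previous filter output
    """
    # 1) Per-sample momentum: smooth each sample's gradient before clipping.
    momenta = beta * momenta + (1.0 - beta) * per_sample_grads
    # 2) Clip each smoothed per-sample vector to bound its contribution.
    clipped = clip_rows(momenta, clip_norm)
    # 3) Add Gaussian noise to the sum, then average over the batch.
    batch_size = per_sample_grads.shape[0]
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped.shape[1])
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    # 4) Low-pass filter (assumed here: an EMA) applied to the noisy output.
    #    It only touches already-noised values, so it is post-processing.
    filtered = alpha * noisy_mean + (1.0 - alpha) * filtered
    return momenta, filtered
```

A caller would keep `momenta` and `filtered` across iterations and use `filtered` as the update direction, for example `theta -= lr * filtered`. The filter actually used in the paper may differ from the EMA assumed here.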

🔎 Similar Papers

Signal Processing Meets SGD: From Momentum to Filter