🤖 AI Summary
Differential privacy stochastic gradient descent (DPSGD) suffers from dual utility degradation: privacy-preserving noise injection and gradient clipping-induced bias—often exhibiting a trade-off between the two. This paper proposes MomFilter, the first method to jointly mitigate both effects at zero additional privacy cost by integrating per-sample momentum with a post-processing low-pass filter that consumes no privacy budget. Theoretically, MomFilter achieves a faster convergence rate than standard DPSGD under differential privacy constraints. Empirically, it consistently improves test accuracy by 2.1–4.7% across benchmarks including CIFAR-10, SVHN, and ImageNet. Under identical privacy budgets (ε ≤ 8), MomFilter outperforms state-of-the-art DPSGD variants across all evaluated settings, establishing a superior privacy–utility trade-off.
📝 Abstract
Differentially Private Stochastic Gradient Descent (DPSGD) is widely used to train deep neural networks with formal privacy guarantees. However, the addition of differential privacy (DP) often degrades model accuracy by introducing both noise and bias. Existing techniques typically address only one of these issues, as reducing DP noise can exacerbate clipping bias and vice-versa. In this paper, we propose a novel method, emph{DP-PMLF}, which integrates per-sample momentum with a low-pass filtering strategy to simultaneously mitigate DP noise and clipping bias. Our approach uses per-sample momentum to smooth gradient estimates prior to clipping, thereby reducing sampling variance. It further employs a post-processing low-pass filter to attenuate high-frequency DP noise without consuming additional privacy budget. We provide a theoretical analysis demonstrating an improved convergence rate under rigorous DP guarantees, and our empirical evaluations reveal that DP-PMLF significantly enhances the privacy-utility trade-off compared to several state-of-the-art DPSGD variants.