🤖 AI Summary
To address the significant degradation in model utility and fairness caused by gradient clipping and noise injection in DP-SGD, this paper proposes a synergistic optimization framework. It introduces, for the first time, a stepwise decaying noise multiplier and gradient clipping threshold mechanism, and establishes, also for the first time, an analytical privacy budget computation model under truncated concentrated differential privacy (tCDP). This approach overcomes key limitations of existing methods, such as DP-SGD-Global-Adapt, in dynamic gradient norm adaptation and convergence stability. With a privacy budget ε of 1, experiments on MNIST, CIFAR-10, and CIFAR-100 show accuracy improvements of 0.98%, 0.68%, and 4.01%, respectively. On the unbalanced MNIST and Thinwall datasets, the privacy cost gap π decreases by 89.83% and 60.55%, respectively. The method improves model utility and group fairness while preserving differential privacy guarantees.
📝 Abstract
Differentially Private Stochastic Gradient Descent (DP-SGD) has become a widely used technique for safeguarding sensitive information in deep learning applications. Unfortunately, DP-SGD's per-sample gradient clipping and uniform noise addition during training can significantly degrade model utility and fairness. We observe that the average gradient norm of the recent DP-SGD-Global-Adapt stays the same throughout training. Even when integrated with the existing linear-decay noise multiplier, it gains little or no advantage. Moreover, we notice that its upper clipping threshold increases exponentially towards the end of training, potentially impairing the model's convergence. Other algorithms (DP-PSAC, Auto-S, DP-SGD-Global, and DP-F) achieve utility and fairness similar to or worse than DP-SGD in our experiments. To overcome these problems and improve utility and fairness, we develop DP-SGD-Global-Adapt-V2-S, which uses a step-decay noise multiplier and an upper clipping threshold that is also decayed step-wise. With a privacy budget ($\epsilon$) of 1, DP-SGD-Global-Adapt-V2-S improves accuracy by 0.9795%, 0.6786%, and 4.0130% on MNIST, CIFAR-10, and CIFAR-100, respectively. It also reduces the privacy cost gap ($\pi$) by 89.8332% and 60.5541% on the unbalanced MNIST and Thinwall datasets, respectively. Finally, we develop mathematical expressions to compute the privacy budget using truncated concentrated differential privacy (tCDP) for DP-SGD-Global-Adapt-V2-T and DP-SGD-Global-Adapt-V2-S.
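To make the idea of step-wise decay concrete, the following is a minimal Python sketch, not the paper's algorithm. It pairs a generic step-decay schedule with textbook DP-SGD per-sample clipping and Gaussian noise; the global scaling and adaptive threshold logic of DP-SGD-Global-Adapt-V2-S are not reproduced here. The function names, the decay factor, and the step size are all hypothetical choices made for illustration.

```python
import math
import random


def step_decay(initial, decay_factor, step_size, epoch):
    """Hold a value constant for `step_size` epochs, then multiply it by
    `decay_factor` (hypothetical schedule, not the paper's exact rule)."""
    return initial * decay_factor ** (epoch // step_size)


def clip_per_sample(grad, threshold):
    """Rescale one per-sample gradient so its L2 norm is at most `threshold`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, threshold / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def noisy_mean_gradient(per_sample_grads, threshold, noise_multiplier, rng):
    """Standard DP-SGD aggregation: clip each gradient, sum, add Gaussian
    noise with std = noise_multiplier * threshold, then average."""
    clipped = [clip_per_sample(g, threshold) for g in per_sample_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    std = noise_multiplier * threshold
    return [(s + rng.gauss(0.0, std)) / len(clipped) for s in summed]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]  # toy per-sample gradients
    for epoch in (0, 10, 20):
        # Both quantities decay step-wise; the constants are assumptions.
        sigma = step_decay(1.0, 0.5, 10, epoch)
        clip = step_decay(1.0, 0.5, 10, epoch)
        update = noisy_mean_gradient(grads, clip, sigma, rng)
        print(epoch, sigma, clip, update)
```

In this sketch the noise multiplier and the clipping threshold share one schedule for brevity; in practice they could use different decay factors and step sizes. A changing noise multiplier also changes the privacy accounting, which is what the tCDP expressions mentioned above are meant to handle.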