🤖 AI Summary
This work addresses two key limitations of differentially private shuffling-based gradient descent (DP-ShuffleG) for convex empirical risk minimization (ERM): (i) its theoretical privacy–accuracy trade-off remains poorly characterized, and (ii) its empirical excess risk consistently exceeds that of DP-SGD. We first establish an upper bound on the excess risk of DP-ShuffleG by integrating privacy amplification by iteration (PABI) with a novel application of Stein's lemma, thereby identifying the root cause of its performance degradation. Building on this analysis, we propose Interleaved-ShuffleG, an alternating optimization framework featuring a surrogate objective, adaptive Gaussian noise injection, and a dissimilarity metric, designed to improve gradient estimation quality while preserving the same privacy budget. We provide theoretical guarantees showing that Interleaved-ShuffleG significantly reduces excess risk. Empirical evaluation across multiple benchmark datasets demonstrates consistent improvements over DP-SGD and state-of-the-art private optimization methods, effectively bridging the gap between theory and practice.
📝 Abstract
We consider the problem of differentially private (DP) convex empirical risk minimization (ERM). While the standard DP-SGD algorithm is theoretically well-established, practical implementations often rely on shuffled gradient methods that traverse the training data sequentially rather than sampling with replacement in each iteration. Despite their widespread use, the theoretical privacy–accuracy trade-offs of private shuffled gradient methods (DP-ShuffleG) remain poorly understood, leading to a gap between theory and practice. In this work, we leverage privacy amplification by iteration (PABI) and a novel application of Stein's lemma to provide the first empirical excess risk bound for DP-ShuffleG. Our result shows that data shuffling leads to worse empirical excess risk for DP-ShuffleG compared to DP-SGD. To address this limitation, we propose Interleaved-ShuffleG, a hybrid approach that integrates public data samples into private optimization. By alternating optimization steps that use private and public samples, Interleaved-ShuffleG effectively reduces empirical excess risk. Our analysis introduces a new optimization framework with surrogate objectives, adaptive noise injection, and a dissimilarity metric, which may be of independent interest. Our experiments on diverse datasets and tasks demonstrate the superiority of Interleaved-ShuffleG over several baselines.
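To make the alternating scheme concrete, here is a minimal sketch of the interleaving idea described above: one shuffled pass over the private data, where each noisy private step (clipped gradient plus Gaussian noise, as in DP-SGD-style mechanisms) is followed by a noise-free step on a public sample. All names, the logistic-loss objective, and hyperparameters are illustrative assumptions, not the authors' actual algorithm; the surrogate objective, adaptive noise schedule, and dissimilarity metric from the paper are omitted for brevity.

```python
import numpy as np

def single_grad(w, x, y):
    # Illustrative per-example logistic-loss gradient (assumed objective).
    z = y * x.dot(w)
    return -y * x / (1.0 + np.exp(z))

def interleaved_sketch(w0, priv_X, priv_y, pub_X, pub_y,
                       lr=0.1, sigma=1.0, clip=1.0, epochs=1, seed=0):
    """Hypothetical sketch of an interleaved private/public pass.

    Each epoch shuffles the private data once and traverses it
    sequentially (no resampling). Private steps are privatized via
    gradient clipping + Gaussian noise; public steps are noise-free.
    """
    rng = np.random.default_rng(seed)
    w = w0.astype(float).copy()
    n = len(priv_y)
    for _ in range(epochs):
        order = rng.permutation(n)  # shuffling instead of i.i.d. sampling
        for i in order:
            # Private step: clip the per-example gradient, then add noise.
            g = single_grad(w, priv_X[i], priv_y[i])
            g = g / max(1.0, np.linalg.norm(g) / clip)
            w -= lr * (g + sigma * clip * rng.standard_normal(w.shape))
            # Public step: plain gradient on a random public sample.
            j = rng.integers(len(pub_y))
            w -= lr * single_grad(w, pub_X[j], pub_y[j])
    return w
```

The public steps require no noise because public samples carry no privacy cost, which is the mechanism by which the hybrid approach can reduce excess risk under the same privacy budget.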