🤖 AI Summary
This work addresses the privacy amplification effect of truncated Poisson sampling (where batches exceeding a capacity threshold are truncated) in differentially private learning. Unlike conventional analyses, which assume ideal Poisson sampling, this paper provides the first rigorous probabilistic modeling and privacy analysis of the truncated setting, establishing a theoretical framework that matches practical training scenarios (e.g., gradient clipping combined with batch truncation). Methodologically, it precisely characterizes the sampling-distribution shift induced by truncation and derives a tighter upper bound on the Rényi differential privacy (RDP) loss; the resulting bound strictly improves upon existing RDP bounds under identical privacy budgets. Empirically, experiments on standard benchmarks validate both the tightness of the theoretical bound and the improved model utility, demonstrating gains in accuracy and convergence speed without weakening the privacy guarantee.
📝 Abstract
We give a new privacy amplification analysis for truncated Poisson sampling, a Poisson sampling variant that truncates a batch if it exceeds a given maximum batch size.
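To make the sampling scheme concrete, here is a minimal sketch of truncated Poisson sampling. Each record is included independently with probability `q` (standard Poisson sampling); if the resulting batch exceeds the maximum batch size, it is truncated. The truncation rule used here (dropping a uniformly random subset of the oversized batch) is an illustrative assumption; the paper's exact truncation rule may differ.

```python
import random

def truncated_poisson_sample(dataset, q, max_batch_size, rng=random):
    """Poisson-sample a batch, then truncate it to max_batch_size.

    Each record enters the batch independently with probability q.
    If the batch exceeds max_batch_size, a uniformly random subset of
    size max_batch_size is kept (one plausible truncation rule; shown
    for illustration only).
    """
    batch = [x for x in dataset if rng.random() < q]
    if len(batch) > max_batch_size:
        batch = rng.sample(batch, max_batch_size)
    return batch
```

Truncation caps the worst-case batch size, but it also perturbs the sampling distribution away from ideal Poisson sampling, which is precisely the shift the privacy analysis must account for.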