🤖 AI Summary
This work addresses the challenge of computing privacy amplification for matrix mechanisms under random allocation (the balls-in-bins model) in differentially private training. Existing sampling-based approaches yield guarantees that either hold only with high probability or require random abstention by the mechanism, and their sample complexity grows inversely with $\delta$. In contrast, this paper develops a sampling-free privacy accounting framework that combines Rényi-divergence bounds, computed efficiently via a dynamic programming formulation, with a conditional composition analysis that provides stronger guarantees in the small-$\epsilon$ regime where Rényi bounds inherently over-approximate. The framework applies to arbitrary matrix mechanisms, both banded and non-banded, and numerical comparisons demonstrate its efficacy across a broad range of mechanisms used in research and practice.
📝 Abstract
We study privacy amplification for differentially private model training with matrix factorization under random allocation (also known as the balls-in-bins model). Recent work by Choquette-Choo et al. (2025) proposes a sampling-based Monte Carlo approach to compute amplification parameters in this setting. However, their guarantees either only hold with some high probability or require random abstention by the mechanism. Furthermore, the required number of samples for ensuring $(\epsilon,\delta)$-DP is inversely proportional to $\delta$. In contrast, we develop sampling-free bounds based on R\'enyi divergence and conditional composition. The former is facilitated by a dynamic programming formulation to efficiently compute the bounds. The latter complements it by offering stronger privacy guarantees for small $\epsilon$, where R\'enyi divergence bounds inherently lead to an over-approximation. Our framework applies to arbitrary banded and non-banded matrices. Through numerical comparisons, we demonstrate the efficacy of our approach across a broad range of matrix mechanisms used in research and practice.
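The abstract notes that Rényi divergence bounds inherently over-approximate for small $\epsilon$; this stems from the standard conversion from a Rényi-DP curve to an $(\epsilon,\delta)$-DP guarantee, which adds a $\log(1/\delta)/(\alpha-1)$ penalty term. Below is a minimal illustrative sketch of that conversion (not the paper's accounting method), instantiated with the well-known RDP curve of the Gaussian mechanism; the function names and the grid of orders $\alpha$ are assumptions for illustration.

```python
import math

def gaussian_rdp(alpha: float, sigma: float) -> float:
    # Renyi divergence of order alpha for the Gaussian mechanism
    # with noise multiplier sigma and sensitivity 1: alpha / (2 sigma^2).
    return alpha / (2 * sigma ** 2)

def rdp_to_dp(delta: float, sigma: float, alphas=None) -> float:
    # Standard RDP -> (eps, delta)-DP conversion:
    #   eps = min_alpha [ eps_alpha + log(1/delta) / (alpha - 1) ].
    # The log(1/delta)/(alpha-1) term is the source of looseness
    # at small eps that conditional composition aims to avoid.
    if alphas is None:
        alphas = [1 + k / 10 for k in range(1, 1000)]  # grid over (1, 101)
    return min(gaussian_rdp(a, sigma) + math.log(1 / delta) / (a - 1)
               for a in alphas)
```

For example, `rdp_to_dp(1e-6, sigma=2.0)` returns a larger $\epsilon$ than `rdp_to_dp(1e-3, sigma=2.0)`, reflecting how the penalty term grows as $\delta$ shrinks.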