🤖 AI Summary
This work addresses the looseness and high computational cost of existing privacy loss analyses under random check-in sampling, which hinder efficient and accurate privacy accounting. The authors propose a general privacy accounting framework based on privacy loss distributions (PLDs), enabling the first mechanism-agnostic and precise privacy analysis for random check-in sampling. This approach overcomes the limitations of prior methods, which either rely on approximations or are tailored to specific mechanisms. Theoretical analysis demonstrates that, under the Gaussian mechanism, random check-in sampling achieves a privacy-utility trade-off at least as good as that of Poisson subsampling, making it particularly well suited for DP-SGD training. Moreover, the proposed framework significantly improves both the efficiency and accuracy of privacy parameter computation.
📝 Abstract
We consider the privacy amplification properties of a sampling scheme in which a user's data is used in $k$ steps chosen randomly and uniformly from a sequence (or set) of $t$ steps. This sampling scheme has recently been applied in the context of differentially private optimization (Chua et al., 2024a; Choquette-Choo et al., 2025) and communication-efficient high-dimensional private aggregation (Asi et al., 2025), where it was shown to have utility advantages over standard Poisson sampling. Theoretical analyses of this sampling scheme (Feldman & Shenfeld, 2025; Dong et al., 2025) lead to bounds that are close to those of Poisson sampling, yet still have two significant shortcomings. First, in many practical settings, the resulting privacy parameters are not tight due to the approximation steps in the analysis. Second, the computed parameters are expressed as either hockey-stick or Rényi divergences, both of which introduce overheads when used in privacy loss accounting.
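The sampling scheme can be illustrated with a short simulation. This is a sketch for intuition only (the function names `random_allocation` and `poisson_sampling` are ours, not the paper's): under random allocation a user's data appears in exactly $k$ of the $t$ steps, while under Poisson sampling with rate $k/t$ the number of participations is random, Binomial($t$, $k/t$).

```python
import random

def random_allocation(t, k, rng=random):
    """Random allocation (random check-in): choose exactly k of the
    t steps uniformly at random for one user's data."""
    return sorted(rng.sample(range(t), k))

def poisson_sampling(t, k, rng=random):
    """Poisson sampling with rate k/t: include the user in each step
    independently, so the participation count is Binomial(t, k/t)."""
    return [s for s in range(t) if rng.random() < k / t]

# Random allocation always yields exactly k participations;
# Poisson sampling only matches k in expectation.
steps = random_allocation(t=100, k=5)
assert len(steps) == 5
```

The fixed participation count is what distinguishes this scheme from Poisson sampling and what the privacy analysis must account for.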
In this work, we demonstrate that the privacy loss distribution (PLD) of random allocation applied to any differentially private algorithm can be computed efficiently. When applied to the Gaussian mechanism, our results demonstrate that the privacy-utility trade-off for random allocation is at least as good as that of Poisson subsampling. In particular, random allocation is better suited for training via DP-SGD. To support these computations, our work develops new tools for general privacy loss accounting based on a notion of PLD realization. This notion allows us to extend accurate privacy loss accounting to subsampling, which previously required manual, noise-mechanism-specific analysis.
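To make the PLD viewpoint concrete, here is a minimal sketch of standard Gaussian-mechanism accounting (our illustration of the general idea, not the paper's algorithm for random allocation): for the Gaussian mechanism with sensitivity 1 and noise scale $\sigma$, the privacy loss random variable is $\mathcal{N}(\mu, 2\mu)$ with $\mu = 1/(2\sigma^2)$. Under $k$-fold composition the PLD stays Gaussian with mean $k\mu$, and $\delta(\varepsilon)$ (the hockey-stick divergence) has a closed form.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_delta(eps, sigma, k=1):
    """delta(eps) for k-fold composition of the Gaussian mechanism
    (sensitivity 1, noise scale sigma).

    The PLD is N(m, 2m) with m = k / (2 * sigma**2); the hockey-stick
    divergence then evaluates to
        delta = Phi((m - eps)/sqrt(2m)) - e^eps * Phi((-m - eps)/sqrt(2m)).
    """
    m = k / (2.0 * sigma ** 2)
    s = math.sqrt(2.0 * m)
    return norm_cdf((m - eps) / s) - math.exp(eps) * norm_cdf((-m - eps) / s)
```

Because the Gaussian PLD composes by simple addition of means and variances, accounting over many DP-SGD steps reduces to evaluating this one formula; the paper's contribution is an efficient PLD computation for random allocation applied on top of such base mechanisms.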