Reveal-or-Obscure: A Differentially Private Sampling Algorithm for Discrete Distributions

📅 2025-04-20

🤖 AI Summary

This work addresses the problem of generating a single sample from an unknown discrete distribution under differential privacy (DP) constraints. Rather than adding explicit noise to the estimated empirical distribution, the reveal-or-obscure (ROO) algorithm achieves ε-DP by randomly deciding whether to “reveal” or “obscure” the empirical distribution. ROO is structurally identical to Algorithm 1 of Cheu and Nayak (arXiv:2412.10512), but we prove a strictly better bound on its sampling complexity than the one established in Theorem 12 of that paper. We further introduce Data-Specific ROO (DS-ROO), a generalized variant in which the probability of obscuring the empirical distribution is chosen adaptively, and we prove that it also satisfies ε-DP. Empirical evidence indicates that DS-ROO can achieve better utility than vanilla ROO under the same privacy budget.

📝 Abstract

We introduce a differentially private (DP) algorithm called reveal-or-obscure (ROO) to generate a single representative sample from a dataset of $n$ observations drawn i.i.d. from an unknown discrete distribution $P$. Unlike methods that add explicit noise to the estimated empirical distribution, ROO achieves $\epsilon$-differential privacy by randomly choosing whether to "reveal" or "obscure" the empirical distribution. While ROO is structurally identical to Algorithm 1 proposed by Cheu and Nayak (arXiv:2412.10512), we prove a strictly better bound on the sampling complexity than that established in Theorem 12 of (arXiv:2412.10512). To further improve the privacy-utility trade-off, we propose a novel generalized sampling algorithm called Data-Specific ROO (DS-ROO), where the probability of obscuring the empirical distribution of the dataset is chosen adaptively. We prove that DS-ROO satisfies $\epsilon$-DP, and provide empirical evidence that DS-ROO can achieve better utility under the same privacy budget as vanilla ROO.
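The reveal-or-obscure idea described above can be illustrated with a minimal Python sketch. Several details are assumptions and are not taken from the paper: the alphabet is taken to be `{0, ..., k-1}`, "obscure" is modeled as drawing uniformly from that alphabet, "reveal" as drawing a uniformly random element of the dataset (that is, a sample from the empirical distribution), and the obscuring probability `q = k / (k + n(e^ε - 1))` is one choice that satisfies $\epsilon$-DP under the replace-one-record neighbor relation. The function names are hypothetical, and the paper's actual parameters may differ.

```python
import math
import random
from collections import Counter


def obscure_probability(n, k, epsilon):
    """Assumed obscuring probability (not from the paper).

    With q = k / (k + n * (e^eps - 1)), the output probabilities of two
    datasets differing in one record differ by a factor of at most e^eps.
    """
    return k / (k + n * math.expm1(epsilon))


def roo_sample(data, k, epsilon, rng=random):
    """Return one sample: obscure with probability q, otherwise reveal."""
    q = obscure_probability(len(data), k, epsilon)
    if rng.random() < q:
        # Obscure: ignore the data and draw uniformly from the alphabet.
        return rng.randrange(k)
    # Reveal: draw from the empirical distribution of the dataset.
    return rng.choice(data)


def output_distribution(data, k, epsilon):
    """Exact output distribution of roo_sample, useful for checking DP."""
    n = len(data)
    q = obscure_probability(n, k, epsilon)
    counts = Counter(data)
    return [q / k + (1 - q) * counts[x] / n for x in range(k)]
```

Under these assumptions, the worst case for the privacy ratio is a symbol that is absent from one dataset and appears once in its neighbor; the chosen `q` makes that ratio exactly $e^{\epsilon}$. A data-adaptive variant in the spirit of DS-ROO would replace the fixed `q` with a dataset-dependent one, but the source does not specify how that probability is chosen, so it is not sketched here.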

Problem

Research questions and friction points this paper is trying to address.

Generate private sample from unknown discrete distribution

Improve privacy-utility trade-off in sampling

Achieve differential privacy without explicit noise

Innovation

Methods, ideas, or system contributions that make the work stand out.

DP algorithm ROO for discrete distributions

Random reveal-obscure mechanism for privacy

Data-Specific ROO adapts obscuring probability
