🤖 AI Summary
This paper studies approximate quantile estimation under differential privacy (DP): given a dataset of $n$ real numbers, output estimates for $m$ specified quantile levels while minimizing the maximum rank error. Methodologically, it is the first to bring continual counting techniques to this problem, using them to randomize the quantiles in a correlated way and achieving pure $\varepsilon$-DP with maximum rank error $O((\log b + \log^2 m)/\varepsilon)$, an improvement over the prior bound of $O(\log b \cdot \log^2 m / \varepsilon)$ that significantly reduces the dependence on $\log b$ and $\log m$. It further proposes an $(\varepsilon,\delta)$-DP mechanism that relaxes the assumption on quantile spacing. Theoretical analysis yields tighter error bounds, and experiments demonstrate substantial performance gains over state-of-the-art methods (e.g., Kaplan, Schnapp, and Stemmer, ICML '22), particularly for large-scale quantile estimation (large $m$) and as $\delta \to 0$.
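To give a concrete sense of the continual counting primitive the summary refers to, below is a minimal Python sketch of the classic binary-tree mechanism. This is an illustration of the general technique, not the paper's actual algorithm; the function name and structure are our own. The key property is that each prefix sum is assembled from $O(\log T)$ noisy dyadic blocks, so the noise across different prefixes is correlated rather than independent.

```python
import numpy as np

def tree_prefix_sums(values, epsilon, rng=None):
    """Binary-tree (continual counting) mechanism sketch: return noisy
    prefix sums of `values`, where each prefix touches O(log T) noisy
    tree nodes, making the noise correlated across prefixes."""
    rng = np.random.default_rng() if rng is None else rng
    T = len(values)
    height = max(1, int(np.ceil(np.log2(T))) + 1)
    # Each input appears in at most `height` tree nodes, so Laplace
    # noise of scale height / epsilon at each node suffices for eps-DP.
    scale = height / epsilon
    # noisy[(level, idx)] = exact sum of the dyadic block + Laplace noise
    noisy = {}
    for level in range(height):
        block = 1 << level
        for idx in range(0, T, block):
            exact = float(np.sum(values[idx:idx + block]))
            noisy[(level, idx)] = exact + rng.laplace(0.0, scale)
    # Each prefix [0, t) decomposes into at most `height` dyadic blocks,
    # following the binary representation of t.
    prefix = np.zeros(T)
    for t in range(1, T + 1):
        total, pos, rem = 0.0, 0, t
        while rem > 0:
            level = rem.bit_length() - 1  # largest power of two <= rem
            total += noisy[(level, pos)]
            pos += 1 << level
            rem -= 1 << level
        prefix[t - 1] = total
    return prefix
```

Independent per-prefix noise would cost error scaling with $T$ under pure DP; the tree decomposition brings this down to polylogarithmic, which is the effect the paper exploits for correlated quantile randomization.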
📝 Abstract
In the approximate quantiles problem, the goal is to output $m$ quantile estimates, the ranks of which are as close as possible to the $m$ given quantiles $q_1,\dots,q_m$. We present a mechanism for approximate quantiles that satisfies $\varepsilon$-differential privacy for a dataset of $n$ real numbers where the ratio between the size of the domain and the distance between the closest pair of points is bounded by $b$. As long as the minimum gap between quantiles is large enough, $|q_i-q_{i-1}|\geq \Omega\left(\frac{m\log(m)\log(b)}{n\varepsilon}\right)$ for all $i$, the maximum rank error of our mechanism is $O\left(\frac{\log(b) + \log^2(m)}{\varepsilon}\right)$ with high probability. Previously, the best known algorithm under pure DP was due to Kaplan, Schnapp, and Stemmer~(ICML '22), who achieve a bound of $O\left(\log(b)\log^2(m)/\varepsilon\right)$, so we save a factor of $\Omega(\min(\log(b),\log^2(m)))$. Our improvement stems from the use of continual counting techniques to randomize the quantiles in a correlated way. We also present an $(\varepsilon,\delta)$-differentially private mechanism that relaxes the gap assumption without affecting the error bound, improving on existing methods when $\delta$ is sufficiently close to zero. We provide an experimental evaluation confirming that our mechanism performs favorably in practice compared to prior work, in particular when the number of quantiles $m$ is large.
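To make the rank-error metric concrete, here is a small sketch (our own illustration; `max_rank_error` and its signature are hypothetical, not from the paper): the error of an estimate $v_i$ for quantile $q_i$ is how far the rank of $v_i$ in the sorted data lands from the target rank $q_i \cdot n$, and the mechanism's error is the worst case over all $m$ quantiles.

```python
import numpy as np

def max_rank_error(data, quantiles, estimates):
    """Maximum rank error of quantile estimates: for each target
    quantile q with estimate v, compare the rank of v in the sorted
    data against the target rank q * n, and take the worst case."""
    data = np.sort(np.asarray(data, dtype=float))
    n = len(data)
    errors = []
    for q, v in zip(quantiles, estimates):
        rank = np.searchsorted(data, v)  # number of data points below v
        errors.append(abs(rank - q * n))
    return max(errors)
```

For example, on the dataset $\{0,\dots,99\}$ the estimate $24.5$ for $q=0.25$ has rank $25$ and target rank $25$, so its rank error is $0$ even though $24.5$ is not a data point; rank error measures position in the data, not distance on the real line.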