🤖 AI Summary
This paper addresses the challenge of controlling relative error in differentially private synthetic data publication. It proposes the PREM framework, which—under (ε,δ)-differential privacy—achieves the first (1±ζ) relative error guarantee for arbitrary query families ℱ. Methodologically, PREM integrates private optimization via multiplicative weights update (MWU), sensitivity tuning, Gaussian/Laplace noise injection, and adaptive query selection. Its key contribution is breaking classical lower-bound barriers: it reduces additive error to poly(log|ℱ|, log|𝒳|, log n, log(1/δ), 1/ε, 1/ζ), depending only on logarithmic parameters—not on the原始 domain size |𝒳| or query family cardinality |ℱ|. Theoretical analysis establishes a nearly tight lower bound that matches this bound, demonstrating significant improvements in practicality and accuracy—particularly in high-dimensional, sparse settings.
📝 Abstract
We introduce $mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a relative error guarantee for statistical queries under $(varepsilon, delta)$ differential privacy (DP). Namely, for a domain ${cal X}$, a family ${cal F}$ of queries $f : {cal X} o {0, 1}$, and $zeta>0$, our framework yields a mechanism that on input dataset $D in {cal X}^n$ outputs a synthetic dataset $widehat{D} in {cal X}^n$ such that all statistical queries in ${cal F}$ on $D$, namely $sum_{x in D} f(x)$ for $f in {cal F}$, are within a $1 pm zeta$ multiplicative factor of the corresponding value on $widehat{D}$ up to an additive error that is polynomial in $log |{cal F}|$, $log |{cal X}|$, $log n$, $log(1/delta)$, $1/varepsilon$, and $1/zeta$. In contrast, any $(varepsilon, delta)$-DP mechanism is known to require worst-case additive error that is polynomial in at least one of $n, |{cal F}|$, or $|{cal X}|$. We complement our algorithm with nearly matching lower bounds.