🤖 AI Summary
This work studies privacy amplification by synthetic data release: the improvement in differential privacy guarantees obtained by publishing only synthetic records rather than the private generative model itself. Under a bounded-parameter assumption, the authors show that this amplification persists even when an unbounded number of synthetic records is released. This removes a key limitation of the prior analysis of Pierquin et al. (2025), whose guarantees for a linear generator held only in asymptotic regimes where the model dimension far exceeds the number of released records, and it yields tighter privacy bounds for this setting, providing sharper and more practical theoretical guarantees for complex data release mechanisms.
📝 Abstract
We study privacy amplification by synthetic data release, a phenomenon in which differential privacy guarantees are improved by releasing only synthetic data rather than the private generative model itself. Recent work by Pierquin et al. (2025) established the first formal amplification guarantees for a linear generator, but these guarantees apply only in asymptotic regimes where the model dimension far exceeds the number of released synthetic records, limiting their practical relevance. In this work, we show a surprising result: under a bounded-parameter assumption, privacy amplification persists even when releasing an unbounded number of synthetic records, thereby improving upon the bounds of Pierquin et al. (2025). Our analysis provides structural insights that may guide the development of tighter privacy guarantees for more complex release mechanisms.
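To make the release mechanism concrete, here is a minimal toy sketch (not the authors' construction) of the pipeline the abstract describes: generator parameters are clipped to a bounded norm (a stand-in for the bounded-parameter assumption) and privatized with Gaussian noise, and then only synthetic records drawn from the noisy generator are published, never the parameters themselves. All function names, the use of the empirical mean as the "generator parameter", and the noise scales are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_linear_generator(X, clip_bound=1.0, noise_scale=0.5):
    """Toy DP 'training' step: privatize the empirical mean of the
    private data (a stand-in for linear-generator parameters).
    Clipping to `clip_bound` plays the role of the bounded-parameter
    assumption; Gaussian noise plays the role of the DP mechanism.
    """
    theta = X.mean(axis=0)
    norm = np.linalg.norm(theta)
    if norm > clip_bound:
        theta = theta * (clip_bound / norm)  # enforce ||theta|| <= B
    return theta + rng.normal(0.0, noise_scale, size=theta.shape)

def release_synthetic(theta_priv, n_records, noise=1.0):
    """Publish synthetic records x_i = theta_priv + fresh noise.
    Only these records are released; theta_priv stays private, which
    is the source of the amplification effect studied in the paper.
    """
    d = theta_priv.shape[0]
    return theta_priv + rng.normal(0.0, noise, size=(n_records, d))

X = rng.normal(size=(100, 5))            # hypothetical private dataset
theta_priv = dp_linear_generator(X)
synth = release_synthetic(theta_priv, n_records=1000)
print(synth.shape)
```

The paper's claim, in this sketch's terms, is that the privacy guarantee seen by an observer of `synth` can be strictly better than that of `theta_priv` itself, even as `n_records` grows without bound.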