Minimax optimal differentially private synthetic data for smooth queries

📅 2026-02-02
🤖 AI Summary
This work addresses the problem of generating theoretically optimal synthetic data for k-smooth queries over the hypercube under (ε, δ)-differential privacy. By extending the Chebyshev moment matching framework and integrating function approximation under high-order derivative constraints with privacy mechanism design, we propose a polynomial-time algorithm that applies to all k-smooth queries. We establish, for the first time, the minimax lower bound in this setting, revealing a phase transition in the error rate at k = d. Our method achieves an error rate of n^{−min{1, k/d}} (up to logarithmic factors), which strictly improves upon existing approaches and represents a significant advance in both utility and theoretical optimality.
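To give a feel for the moment-matching idea mentioned above, here is a minimal one-dimensional sketch, not the paper's algorithm. It computes the empirical Chebyshev moments of data in [-1, 1] and perturbs them with Gaussian noise. The function name `noisy_chebyshev_moments` and the parameter `noise_scale` are illustrative assumptions; calibrating the noise to a given (ε, δ) budget and generating synthetic points from the noisy moments are both left out.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def noisy_chebyshev_moments(x, degree, noise_scale, rng):
    """Empirical moments of T_0..T_degree on 1-D data, plus Gaussian noise.

    Hypothetical helper for illustration only. `noise_scale` is an
    assumed free parameter; it is NOT calibrated to any privacy budget here.
    """
    x = np.asarray(x, dtype=float)
    if np.any(np.abs(x) > 1.0):
        raise ValueError("data must lie in [-1, 1]")

    # chebvander returns a matrix whose column j holds T_j(x_i);
    # averaging over rows gives the empirical moment of each T_j.
    moments = C.chebvander(x, degree).mean(axis=0)

    # Since |T_j| <= 1 on [-1, 1], replacing one of n points changes
    # each moment by at most 2/n, which is what makes noise addition viable.
    noise = rng.normal(0.0, noise_scale, size=moments.shape)
    noise[0] = 0.0  # T_0 = 1, so its moment is always 1 and needs no noise
    return moments + noise


rng = np.random.default_rng(0)
data = rng.uniform(-1.0, 1.0, size=1000)
print(noisy_chebyshev_moments(data, degree=4, noise_scale=0.01, rng=rng))
```

A full pipeline would then search for a synthetic dataset whose Chebyshev moments are close to the noisy ones; that step is where the method-specific design lives and is not reproduced here.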

📝 Abstract
Differentially private synthetic data enables the sharing and analysis of sensitive datasets while providing rigorous privacy guarantees for individual contributors. A central challenge is to achieve strong utility guarantees for meaningful downstream analysis. Many existing methods ensure uniform accuracy over broad query classes, such as all Lipschitz functions, but this level of generality often leads to suboptimal rates for statistics of practical interest. Since many common data analysis queries exhibit smoothness beyond what worst-case Lipschitz bounds capture, we ask whether exploiting this additional structure can yield improved utility. We study the problem of generating $(\varepsilon,\delta)$-differentially private synthetic data from a dataset of size $n$ supported on the hypercube $[-1,1]^d$, with utility guarantees uniformly for all smooth queries having bounded derivatives up to order $k$. We propose a polynomial-time algorithm that achieves a minimax error rate of $n^{-\min \{1, \frac{k}{d}\}}$, up to a $\log(n)$ factor. This characterization uncovers a phase transition at $k=d$. Our results generalize the Chebyshev moment matching framework of (Musco et al., 2025; Wang et al., 2016) and strictly improve the error rates for $k$-smooth queries established in (Wang et al., 2016). Moreover, we establish the first minimax lower bound for the utility of $(\varepsilon,\delta)$-differentially private synthetic data with respect to $k$-smooth queries, extending the Wasserstein lower bound for $\varepsilon$-differential privacy in (Boedihardjo et al., 2024).
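The stated rate $n^{-\min\{1, k/d\}}$ and its phase transition at $k=d$ can be made concrete with a small calculation. The sketch below ignores the $\log(n)$ factor and all constants, so it shows only how the exponent scales; the helper names `rate_exponent` and `error_rate` are illustrative, not from the paper.

```python
def rate_exponent(k: int, d: int) -> float:
    """Exponent a in the rate n^(-a), with a = min(1, k/d)."""
    if k < 1 or d < 1:
        raise ValueError("k and d must be positive integers")
    return min(1.0, k / d)


def error_rate(n: int, k: int, d: int) -> float:
    """Rate n^(-min(1, k/d)), ignoring log factors and constants."""
    return n ** (-rate_exponent(k, d))


# Fix the dimension and vary the smoothness order k.
d = 4
for k in (1, 2, 4, 8):
    print(f"k={k}, d={d}: exponent={rate_exponent(k, d)}")
```

For $d = 4$, the exponent grows linearly in $k$ (0.25 at $k=1$, 0.5 at $k=2$) until it reaches 1 at $k = d = 4$; smoothness beyond that point does not improve the rate further.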
Problem

Research questions and friction points this paper is trying to address.

differentially private synthetic data
smooth queries
minimax error
utility guarantees
privacy-preserving data analysis
Innovation

Methods, ideas, or system contributions that make the work stand out.

differentially private synthetic data
smooth queries
minimax optimality
Chebyshev moment matching
phase transition