Instance-Optimality for Private KL Distribution Estimation

📅 2025-05-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper investigates instance-optimal estimation of discrete probability distributions under differential privacy, with error measured by KL divergence. It departs from the classical minimax paradigm to focus on individual (rather than worst-case) distributions. We introduce the first formal definition of instance optimality under KL divergence, characterizing adaptive local lower bounds within a distribution-specific neighborhood. To achieve this, we design a differentially private estimator inspired by the Good–Turing framework, integrating privacy mechanisms with localized neighborhood analysis. We further derive an information-theoretic lower bound and prove that our estimator’s upper bound matches it up to constant factors. Theoretically, our estimator attains instance optimality in KL divergence; empirically, it significantly outperforms existing minimax-optimal private estimators across diverse distributions.

📝 Abstract
We study the fundamental problem of estimating an unknown discrete distribution $p$ over $d$ symbols, given $n$ i.i.d. samples from the distribution. We are interested in minimizing the KL divergence between the true distribution and the algorithm's estimate. We first construct minimax-optimal private estimators. Minimax optimality, however, fails to shed light on an algorithm's performance on individual (non-worst-case) instances $p$, and simple minimax-optimal DP estimators can have poor empirical performance on real distributions. We then study this problem from an instance-optimality viewpoint, where the algorithm's error on $p$ is compared to the minimum achievable estimation error over a small local neighborhood of $p$. Under natural notions of local neighborhood, we propose algorithms that achieve instance-optimality up to constant factors, with and without a differential privacy constraint. Our upper bounds rely on (private) variants of the Good-Turing estimator. Our lower bounds use additive local neighborhoods that capture the hardness of distribution estimation in KL divergence more precisely than the neighborhoods considered in prior works.
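To make the setting concrete, the following is a minimal sketch of a simple private baseline of the kind the abstract contrasts against: a Laplace-noised, smoothed histogram evaluated in KL divergence. It is not the paper's estimator. The function names, the `smoothing` parameter, the pure $\varepsilon$-DP guarantee, and the replace-one-sample neighboring relation are all assumptions made for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) in nats; symbols with p[i] == 0 contribute zero."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def private_smoothed_estimate(samples, d, epsilon, rng, smoothing=1.0):
    """Hypothetical baseline: Laplace-noised counts, clipped and smoothed.

    Assumes neighboring datasets differ by replacing one sample, so the
    count vector has L1 sensitivity 2 and Laplace noise of scale
    2 / epsilon gives epsilon-DP. Clipping, smoothing, and normalizing
    are post-processing and do not affect the privacy guarantee.
    """
    counts = np.bincount(samples, minlength=d).astype(float)
    noisy = counts + rng.laplace(scale=2.0 / epsilon, size=d)
    # Every symbol must receive positive mass, otherwise KL can be infinite.
    noisy = np.clip(noisy, 0.0, None) + smoothing
    return noisy / noisy.sum()

# Usage on an illustrative distribution over d = 4 symbols.
rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.15, 0.05])
samples = rng.choice(len(p), size=2000, p=p)
p_hat = private_smoothed_estimate(samples, d=len(p), epsilon=1.0, rng=rng)
print(kl_divergence(p, p_hat))
```

The smoothing step matters because KL divergence is infinite whenever the estimate assigns zero mass to a symbol that the true distribution can produce.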
Problem

Research questions and friction points this paper is trying to address.

Estimating a discrete distribution with minimal KL divergence from the true distribution
Achieving instance-optimality under differential privacy constraints
Improving empirical performance beyond minimax-optimal estimators
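For reference, the two quantities named in the list above have the following standard textbook forms. The symbols $\hat{p}$, $\mathcal{A}$, and $\varepsilon$ are notation assumed here, and the abstract does not say whether the paper works with pure or approximate differential privacy, so the pure variant below is only one possibility.

$$
\mathrm{KL}(p \,\|\, \hat{p}) = \sum_{i=1}^{d} p_i \log \frac{p_i}{\hat{p}_i}
$$

$$
\Pr[\mathcal{A}(X) \in S] \le e^{\varepsilon} \, \Pr[\mathcal{A}(X') \in S]
\quad \text{for all datasets } X, X' \text{ differing in one sample and all measurable } S
$$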
Innovation

Methods, ideas, or system contributions that make the work stand out.

Private Good-Turing estimator variants
Instance-optimality with local neighborhoods
Minimax-optimal private estimators
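As background for the Good-Turing variants mentioned above, here is a minimal sketch of the classical, non-private Good-Turing missing-mass estimate: the fraction of samples whose symbol appears exactly once estimates the total probability of symbols not seen at all. This is the textbook estimator, not the paper's private variant, and the function name is hypothetical.

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Classical Good-Turing estimate of the unseen probability mass.

    Returns N1 / n, where N1 is the number of distinct symbols that
    appear exactly once and n is the number of samples.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / n

# "b" and "c" each appear once among 4 samples.
print(good_turing_missing_mass(["a", "a", "b", "c"]))  # 0.5
```

A private version would additionally need to bound how much one sample can change the singleton count and add calibrated noise; how the paper does this is not described in the abstract.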
Jiayuan Ye
National University of Singapore
V. Feldman
Apple
Kunal Talwar
Apple Inc
Machine Learning, Differential Privacy, Algorithms