Better Private Distribution Testing by Leveraging Unverified Auxiliary Data

📅 2025-03-18
🤖 AI Summary
This work studies distribution testing under differential privacy when the analyst can draw on public but possibly unreliable auxiliary information about the data distribution, focusing on uniformity, identity, and closeness testing. It extends augmented distribution testing to the differentially private setting and designs private algorithms whose sample complexity decreases smoothly as the claimed quality of the auxiliary information improves. Matching information-theoretic lower bounds show that this sample complexity is optimal, up to logarithmic factors, under the given privacy budget. When the auxiliary information is of high quality, the approach reduces the number of samples that the privacy constraint would otherwise require.

📝 Abstract
We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, uniformity, identity, and closeness testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).
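
To make the setting concrete, here is a minimal sketch of how a public hint can yield a low-sensitivity statistic for private uniformity testing. It is an illustration only, not the algorithm from the paper: the function names, the `hint` vector, the claimed distance `alpha`, and the `alpha / 2` threshold are all assumptions introduced here. The statistic counts samples falling in the set of elements that the hint claims are heavier than uniform, then adds Laplace noise. The sketch does not handle the case of an inaccurate hint, which the augmented framework must address.

```python
import random


def private_scheffe_statistic(samples, hint, epsilon, rng):
    """Noisy estimate of p(S) - |S|/n, where S is the hint's heavy set."""
    n = len(hint)
    # Elements that the (unverified) hint claims are heavier than uniform.
    heavy = {i for i, mass in enumerate(hint) if mass > 1.0 / n}
    count = sum(1 for x in samples if x in heavy)
    # Replacing one sample changes the count by at most 1 (sensitivity 1).
    # The difference of two Exp(epsilon) draws is Laplace with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    noisy_mass = (count + noise) / len(samples)
    return noisy_mass - len(heavy) / n


def reject_uniformity(samples, hint, alpha, epsilon, rng):
    """Reject when the noisy excess mass exceeds half the claimed distance."""
    return private_scheffe_statistic(samples, hint, epsilon, rng) > alpha / 2


# Toy domain of size 10; the hint is at total variation distance 0.45 from uniform.
hint = [0.19] * 5 + [0.01] * 5
alpha = 0.45

# Deterministic toy datasets: one matches the hint's proportions, one is balanced.
pattern = list(range(5)) * 19 + list(range(5, 10))
skewed = pattern * 20
balanced = [i % 10 for i in range(2000)]

rng = random.Random(0)
print(reject_uniformity(skewed, hint, alpha, epsilon=1.0, rng=rng))
print(reject_uniformity(balanced, hint, alpha, epsilon=1.0, rng=rng))
```

In this toy setup the noise has scale 1/epsilon on a count of 2000 samples, so it barely moves the statistic. With fewer samples or a smaller epsilon the noise matters more, and that sample-versus-privacy trade-off is what the paper's bounds quantify.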

Problem

Research questions and friction points this paper is trying to address.

Extend augmented distribution testing to the differentially private setting.
Design private algorithms for uniformity, identity, and closeness testing that exploit public but possibly erroneous prior knowledge.
Prove information-theoretic lower bounds showing that the sample complexity is optimal up to logarithmic factors.

Innovation

Methods, ideas, or system contributions that make the work stand out.

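Extends the augmented distribution testing framework of Aliakbarpour, Indyk, Rubinfeld, and Silwal (NeurIPS 2024) to differential privacy.
Designs private algorithms for uniformity, identity, and closeness testing on sensitive data.
Achieves sample complexity that scales smoothly with the claimed quality of the auxiliary information and is optimal up to logarithmic factors.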