AI Summary
This paper investigates uniformity and identity testing under user-level local differential privacy (LDP), where each user contributes multiple independent samples from an unknown distribution and the privacy guarantee must cover a user's entire batch. Addressing the absence of testing results in this setting, the paper gives the first uniformity and identity testers for user-level LDP: working in the symmetric, private-coin model, it designs testing algorithms that require no public randomness, global user identifiers, or trusted central server, while achieving near-optimal statistical performance. The analysis establishes that the proposed methods attain sample complexity approaching the information-theoretic lower bound under strong privacy guarantees. This work fills a theoretical gap in distribution testing under user-level LDP and provides lightweight, deployable testing tools for privacy-preserving data publishing and federated analytics.
Abstract
We initiate the study of distribution testing under \emph{user-level} local differential privacy, where each of $n$ users contributes $m$ samples from the unknown underlying distribution. This setting, albeit very natural, is significantly more challenging than the usual locally private setting, as for the same privacy parameter $\varepsilon$ the guarantee must now apply to a user's full batch of $m$ data points. While some recent work considers distribution \emph{learning} in this user-level setting, nothing was known for even the most fundamental testing task, uniformity testing (and its generalization, identity testing).
We address this gap by providing (nearly) sample-optimal user-level LDP algorithms for uniformity and identity testing. Motivated by practical considerations, our main focus is on the private-coin, symmetric setting, which requires neither that users share a common random seed nor that they be assigned globally unique identifiers.
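As a concrete illustration of the setting, the sketch below simulates a symmetric, private-coin user-level LDP uniformity test: every user applies the same local randomizer to the normalized histogram of their entire batch of $m$ samples (Laplace noise calibrated to batch-level sensitivity), and an untrusted aggregator averages the reports and thresholds their distance to uniform. This is a minimal baseline written under our own assumptions; the mechanism, statistic, parameter values, and threshold calibration are illustrative and are not the paper's algorithm, which attains (nearly) optimal sample complexity.

```python
import numpy as np

def user_level_ldp_report(samples, k, epsilon, rng):
    """One user's local randomizer (private-coin: only local randomness is used).

    The normalized histogram of a batch has L1 sensitivity 2 when the *entire*
    batch of m samples may change, so per-coordinate Laplace noise of scale
    2/epsilon gives an epsilon-LDP guarantee at the user level. This mechanism
    is an illustrative choice, not the paper's (near-optimal) one.
    """
    hist = np.bincount(samples, minlength=k) / len(samples)
    return hist + rng.laplace(scale=2.0 / epsilon, size=k)

def reject_uniformity(reports, k, threshold):
    """Untrusted aggregator: average the noisy reports and threshold the
    squared L2 distance to the uniform distribution over a domain of size k."""
    avg = np.mean(reports, axis=0)
    return np.sum((avg - 1.0 / k) ** 2) > threshold  # True = declare "not uniform"

# Toy run with hypothetical parameters: n users, m samples each, domain size k.
rng = np.random.default_rng(0)
n, m, k, epsilon = 2000, 50, 10, 1.0

# Ad hoc calibration: the statistic's noise floor under uniformity, plus three
# standard deviations of that floor (ignores the much smaller sampling variance).
noise_var = 2 * (2.0 / epsilon) ** 2 / n      # variance of each averaged coordinate
threshold = k * noise_var + 3 * np.sqrt(2 * k) * noise_var

for p in (np.full(k, 1.0 / k),                           # truly uniform -> False
          np.array([0.5] + [0.5 / (k - 1)] * (k - 1))):  # far from uniform -> True
    data = [rng.choice(k, size=m, p=p) for _ in range(n)]
    reports = [user_level_ldp_report(x, k, epsilon, rng) for x in data]
    print(reject_uniformity(reports, k, threshold))
```

Because all users run the identical randomizer and the aggregator treats the reports as an unordered collection, no shared random seed or global user identifiers are needed, matching the private-coin, symmetric setting described above.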