🤖 AI Summary
Addressing the challenge of simultaneously achieving privacy, accuracy, and scalability in distributed multi-quantile estimation, this paper proposes an end-to-end differentially private scheme built on a two-server architecture. The scheme requires no trusted central server, remains secure against malicious adversaries, and matches the accuracy of the central differential privacy (DP) model. Its key idea is to release carefully chosen intermediate statistics, which reduces the complexity of the secure multi-party computation (MPC) while preserving end-to-end DP. On one million records with a domain size of $10^9$, it estimates five quantiles in under a minute, with up to four orders of magnitude higher accuracy than local differential privacy (LDP) and up to roughly ten times faster runtime than the baselines. The core contribution is a framework that combines central-DP-level accuracy, security against malicious adversaries, and scalability for collaborative multi-quantile analysis in distributed settings.
📝 Abstract
Quantiles are key in distributed analytics, but computing them over sensitive data risks privacy. Local differential privacy (LDP) offers strong protection but lower accuracy than central DP, which assumes a trusted aggregator. Secure multi-party computation (MPC) can bridge this gap, but generic MPC solutions face scalability challenges due to large domains, complex secure operations, and multi-round interactions.
We present Piquant$\varepsilon$, a system for privacy-preserving estimation of multiple quantiles in a distributed setting without relying on a trusted server. Piquant$\varepsilon$ operates under the malicious threat model and achieves accuracy of the central DP model. Built on the two-server model, Piquant$\varepsilon$ uses a novel strategy of releasing carefully chosen intermediate statistics, reducing MPC complexity while preserving end-to-end DP. Empirically, Piquant$\varepsilon$ estimates 5 quantiles on 1 million records in under a minute with domain size $10^9$, achieving up to $10^4$-fold higher accuracy than LDP, and up to $\sim 10\times$ faster runtime compared to baselines.
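To make "accuracy of the central DP model" concrete, the sketch below shows one standard central-model mechanism for a single quantile: the exponential mechanism over the intervals between sorted data points. It is not the Piquant$\varepsilon$ protocol. It assumes a trusted curator holding all the data in the clear, which is exactly the assumption the paper removes. The function name `dp_quantile` and its parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def dp_quantile(data, q, eps, lo, hi, rng=None):
    """Illustrative central-DP estimate of the q-th quantile (not the paper's protocol).

    Exponential mechanism: each interval between consecutive sorted points is
    scored by how far its rank is from the target rank q * n (sensitivity 1),
    an interval is sampled with probability proportional to
    width * exp(eps * score / 2), and a uniform point inside it is returned.
    Assumes lo < hi and a trusted curator who sees the raw data.
    """
    rng = rng or random.Random()
    xs = sorted(min(max(x, lo), hi) for x in data)  # clip to the public domain
    n = len(xs)
    edges = [lo] + xs + [hi]

    # Log-weights for the n + 1 intervals, kept in log space for stability.
    log_w = []
    for i in range(n + 1):
        width = edges[i + 1] - edges[i]
        if width <= 0:
            log_w.append(-math.inf)  # empty interval can never be chosen
        else:
            log_w.append(math.log(width) - eps * abs(i - q * n) / 2.0)

    top = max(log_w)
    weights = [math.exp(lw - top) for lw in log_w]
    idx = rng.choices(range(n + 1), weights=weights, k=1)[0]
    return rng.uniform(edges[idx], edges[idx + 1])


if __name__ == "__main__":
    records = list(range(1001))  # toy data with true median 500
    print(dp_quantile(records, q=0.5, eps=5.0, lo=0, hi=1000))
```

In this central setting the noise affects only the rank of the returned value, whereas under LDP every record is perturbed before it leaves the client. That difference is the source of the accuracy gap the abstract describes.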