AI Summary
When multiple data owners cannot share raw data, conventional secure collaborative analytics (SCA) systems face three fundamental challenges: unbounded privacy loss, suboptimal query planning, and lossy analysis. To address these, this paper proposes the first private-summary-based collaborative analytics framework, which simultaneously achieves bounded privacy loss, high-level query planning, and lossless analysis. Our method introduces precomputed differentially private indexes and operation-cost estimation to eliminate runtime privacy overhead; employs a one-sided noise mechanism, private upper-bound estimation, and data-agnostic index construction; and integrates differential privacy-enhanced execution plan optimization. Experimental results demonstrate that, compared to state-of-the-art approaches, our framework accelerates queries by up to 80×, reduces memory consumption for complex queries by up to 900×, and decreases cumulative privacy loss by up to 89× in continuous analytical workloads.
Abstract
Secure collaborative analytics (SCA) enables the processing of analytical SQL queries across data from multiple owners, even when direct data sharing is not possible. While traditional SCA provides strong privacy through data-oblivious methods, its significant overhead has limited its practical use. Recent SCA variants that allow controlled leakage under differential privacy (DP) strike a balance between privacy and efficiency but still face challenges such as unbounded privacy loss, costly execution plans, and lossy processing.
To address these challenges, we introduce SPECIAL, the first SCA system that simultaneously ensures bounded privacy loss, advanced query planning, and lossless processing. SPECIAL employs a novel synopsis-assisted secure processing model, where a one-time privacy cost is used to generate private synopses from owner data. These synopses enable SPECIAL to estimate compaction sizes for secure operations (e.g., filter, join) and index encrypted data without additional privacy loss. These estimates and indexes can be prepared before runtime, enabling efficient query planning and accurate cost estimations. By leveraging one-sided noise mechanisms and private upper bound techniques, SPECIAL guarantees lossless processing for complex queries (e.g., multi-join). Our comprehensive benchmarks demonstrate that SPECIAL outperforms state-of-the-art SCAs, with up to 80× faster query times, 900× smaller memory usage for complex queries, and up to 89× reduced privacy loss in continual processing.
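To illustrate why one-sided noise yields lossless processing, the sketch below pads a true intermediate result size with non-negative noise, so the compacted output can never be smaller than the real result and no tuple is dropped. It is only an illustration under stated assumptions: the shifted-and-clamped Laplace construction is one common way to obtain one-sided noise and is not confirmed by the abstract to be SPECIAL's mechanism, and the names `one_sided_laplace`, `padded_size`, `epsilon`, `delta`, and `max_size` are hypothetical.

```python
import math
import random


def one_sided_laplace(epsilon, delta, sensitivity=1.0, rng=random):
    """Draw non-negative noise: Laplace shifted right, then clamped at zero.

    Assumes 0 < delta < 0.5 so that the shift is non-negative.
    """
    scale = sensitivity / epsilon
    # Shift chosen so the unclamped noise is negative with probability delta.
    shift = scale * math.log(1.0 / (2.0 * delta))
    # The difference of two i.i.d. exponentials is Laplace(0, scale).
    lap = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return max(0.0, lap + shift)


def padded_size(true_size, epsilon, delta, max_size, rng=random):
    """Noisy size for a compacted operator output.

    max_size is the data-independent worst case (e.g. the input size of a
    filter), so the result always lies in [true_size, max_size].
    """
    noisy = true_size + math.ceil(one_sided_laplace(epsilon, delta, rng=rng))
    return min(noisy, max_size)
```

With two-sided noise the padded size could fall below `true_size`, forcing the system to discard real tuples; clamping the noise at zero trades a small `delta` for the guarantee that the result is never truncated.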