SPECIAL: Synopsis Assisted Secure Collaborative Analytics

📅 2024-04-29

🏛️ Proceedings of the VLDB Endowment

📈 Citations: 1

✨ Influential: 1

🤖 AI Summary

When multiple data owners cannot share raw data, conventional secure collaborative analytics (SCA) systems face three fundamental challenges: unbounded privacy loss, costly execution plans, and lossy processing. To address these, the paper proposes SPECIAL, the first synopsis-based collaborative analytics framework that simultaneously achieves bounded privacy loss, advanced query planning, and lossless processing. The method spends a one-time privacy cost to precompute differentially private synopses, which are then used to build indexes and estimate operation costs without further privacy loss at runtime. It combines a one-sided noise mechanism, private upper-bound estimation, and index construction over encrypted data with execution plan optimization that accounts for differential privacy. Experimental results show that, compared to state-of-the-art approaches, the framework accelerates queries by up to 80×, reduces memory consumption for complex queries by up to 900×, and decreases cumulative privacy loss by up to 89× in continual analytical workloads.

📝 Abstract

Secure collaborative analytics (SCA) enables the processing of analytical SQL queries across data from multiple owners, even when direct data sharing is not possible. While traditional SCA provides strong privacy through data-oblivious methods, the significant overhead has limited its practical use. Recent SCA variants that allow controlled leakages under differential privacy (DP) strike a balance between privacy and efficiency but still face challenges such as unbounded privacy loss, costly execution plans, and lossy processing. To address these challenges, we introduce SPECIAL, the first SCA system that simultaneously ensures bounded privacy loss, advanced query planning, and lossless processing. SPECIAL employs a novel synopsis-assisted secure processing model, where a one-time privacy cost is used to generate private synopses from owner data. These synopses enable SPECIAL to estimate compaction sizes for secure operations (e.g., filter, join) and index encrypted data without additional privacy loss. These estimates and indexes can be prepared before runtime, enabling efficient query planning and accurate cost estimation. By leveraging one-sided noise mechanisms and private upper bound techniques, SPECIAL guarantees lossless processing for complex queries (e.g., multi-join). Our comprehensive benchmarks demonstrate that SPECIAL outperforms state-of-the-art SCAs, with up to 80× faster query times, up to 900× lower memory usage for complex queries, and up to 89× lower privacy loss in continual processing.
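
To make the idea of one-sided noise concrete, here is a minimal Python sketch of a shifted Laplace mechanism that releases a noisy upper bound on a count, such as the output size of a filter. It is an illustration under stated assumptions, not SPECIAL's actual mechanism: the function names, the `beta` failure-probability parameter, and the choice of Laplace noise are all hypothetical. The shift is chosen so that the released value falls below the true count with probability at most `beta`, which is what lets a compacted result buffer hold every matching row.

```python
import math
import random


def laplace_sample(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def one_sided_shift(epsilon, beta, sensitivity=1.0):
    """Shift t such that P(Laplace(0, sensitivity/epsilon) < -t) = beta.

    Assumes 0 < beta <= 0.5 so that the shift is non-negative.
    """
    if not (0.0 < beta <= 0.5):
        raise ValueError("beta must be in (0, 0.5]")
    scale = sensitivity / epsilon
    return scale * math.log(1.0 / (2.0 * beta))


def noisy_upper_bound(true_count, epsilon, beta, sensitivity=1.0, rng=None):
    """Release a DP count that overestimates with probability >= 1 - beta.

    The shift is a public constant, so the release keeps the epsilon-DP
    guarantee of the plain Laplace mechanism; rounding up and clamping at
    zero are post-processing.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon
    noisy = true_count + one_sided_shift(epsilon, beta, sensitivity)
    noisy += laplace_sample(scale, rng)
    return max(0, math.ceil(noisy))


# Hypothetical usage: size a compacted filter output buffer.
rng = random.Random(7)
buffer_size = noisy_upper_bound(true_count=1000, epsilon=0.5, beta=0.01, rng=rng)
```

A smaller `beta` makes dropped rows less likely but increases the padding, so the buffer size trades memory for the lossless guarantee.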

Problem

Research questions and friction points this paper is trying to address.

Ensuring bounded privacy loss in collaborative analytics

Optimizing query planning for secure data processing

Achieving lossless processing for complex analytical queries

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a synopsis-assisted secure processing model

Employs one-sided noise for lossless processing

Prepares estimates and indexes before runtime
