SPECIAL: Synopsis Assisted Secure Collaborative Analytics

📅 2024-04-29
🏛️ Proceedings of the VLDB Endowment
📈 Citations: 1
✨ Influential: 1
🤖 AI Summary
When multiple data owners cannot share raw data, existing secure collaborative analytics (SCA) systems face three challenges: unbounded privacy loss, costly execution plans, and lossy processing. This paper proposes SPECIAL, which it describes as the first SCA system to achieve bounded privacy loss, advanced query planning, and lossless processing at the same time. SPECIAL spends a one-time privacy cost to generate differentially private synopses from owner data; these synopses are used to estimate the cost of secure operations and to index encrypted data before runtime, so queries incur no additional privacy loss. The design combines one-sided noise mechanisms and private upper-bound estimation, which keep complex queries such as multi-joins lossless, with data-agnostic index construction and differential privacy–enhanced execution plan optimization. In the reported benchmarks, SPECIAL achieves up to 80× faster queries, up to 900× lower memory consumption for complex queries, and up to 89× lower cumulative privacy loss in continual workloads, compared with state-of-the-art approaches.

๐Ÿ“ Abstract
Secure collaborative analytics (SCA) enables the processing of analytical SQL queries across data from multiple owners, even when direct data sharing is not possible. While traditional SCA provides strong privacy through data-oblivious methods, the significant overhead has limited its practical use. Recent SCA variants that allow controlled leakages under differential privacy (DP) strike a balance between privacy and efficiency but still face challenges such as unbounded privacy loss, costly execution plans, and lossy processing. To address these challenges, we introduce SPECIAL, the first SCA system that simultaneously ensures bounded privacy loss, advanced query planning, and lossless processing. SPECIAL employs a novel synopsis-assisted secure processing model, in which a one-time privacy cost is used to generate private synopses from owner data. These synopses enable SPECIAL to estimate compaction sizes for secure operations (e.g., filter, join) and index encrypted data without additional privacy loss. These estimates and indexes can be prepared before runtime, enabling efficient query planning and accurate cost estimations. By leveraging one-sided noise mechanisms and private upper bound techniques, SPECIAL guarantees lossless processing for complex queries (e.g., multi-join). Our comprehensive benchmarks demonstrate that SPECIAL outperforms state-of-the-art SCAs, with up to 80× faster query times, 900× smaller memory usage for complex queries, and up to 89× reduced privacy loss in continual processing.
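The one-sided noise idea in the abstract can be illustrated with a minimal Python sketch. It assumes a Laplace draw that is shifted upward and clamped at zero, which is one common way to obtain non-negative noise; the abstract does not say which mechanism SPECIAL uses, so the function names, the parameters `epsilon` and `delta`, and the shift formula are illustrative assumptions, and the privacy analysis is not reproduced here. The sketch only shows why padding an operator's output size with non-negative noise never drops true tuples.

```python
import math
import random


def laplace_shift(epsilon: float, delta: float, sensitivity: float = 1.0) -> float:
    """Shift t such that a Laplace(0, sensitivity/epsilon) draw falls below -t
    with probability delta (assumes 0 < delta < 0.5)."""
    scale = sensitivity / epsilon
    return scale * math.log(1.0 / (2.0 * delta))


def one_sided_noise(epsilon: float, delta: float, sensitivity: float = 1.0, rng=None) -> int:
    """Non-negative integer noise: shifted Laplace, clamped at zero (hypothetical helper)."""
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential draws is Laplace(0, scale).
    lap = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    shifted = laplace_shift(epsilon, delta, sensitivity) + lap
    return max(0, math.ceil(shifted))


def padded_size(true_size: int, epsilon: float, delta: float, rng=None) -> int:
    """Size to which a secure operator's output is compacted.
    Because the noise is never negative, no true tuple is cut off."""
    return true_size + one_sided_noise(epsilon, delta, rng=rng)
```

With two-sided noise the padded size could fall below the true size and silently truncate results; clamping trades a small probability mass (the `delta` term in this sketch) for a guarantee that the padded size is at least the true size.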
Problem

Research questions and friction points this paper is trying to address.

Ensuring bounded privacy loss in collaborative analytics
Optimizing query planning for secure data processing
Achieving lossless processing for complex analytical queries
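To make the bounded-versus-unbounded distinction concrete, the sketch below compares cumulative privacy loss under basic sequential composition for two hypothetical designs: one that pays a fresh budget on every query, and one that pays once to release synopses and then answers queries from them. The function names and the epsilon values are illustrative assumptions and are unrelated to the paper's measured results.

```python
def cumulative_epsilon_per_query(eps_per_query: float, num_queries: int) -> float:
    """Basic sequential composition: per-query privacy losses add up,
    so the total grows without bound as queries keep arriving."""
    return eps_per_query * num_queries


def cumulative_epsilon_one_time(eps_synopsis: float, num_queries: int) -> float:
    """One-time synopsis release: later queries reuse the released synopses,
    so the total stays at the initial cost regardless of num_queries."""
    return eps_synopsis


if __name__ == "__main__":
    # Illustrative budgets only (assumed values).
    for n in (1, 10, 100):
        print(n, cumulative_epsilon_per_query(0.5, n), cumulative_epsilon_one_time(2.0, n))
```

In this toy setting the per-query design overtakes the one-time design after four queries, and the gap keeps widening in a continual workload.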
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a synopsis-assisted secure processing model
Employs one-sided noise for lossless processing
Prepares estimates and indexes before runtime
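The points above can be sketched as a small Python example showing how size estimates for filters and joins might be derived from synopses ahead of time. It assumes each synopsis is a histogram over a fixed set of bins on the relevant attribute, that every bin of the domain is present, and that each noisy count is at least the true count (as one-sided noise would give). The `Histogram` type and both function names are hypothetical and are not taken from the paper.

```python
from typing import Dict

# Hypothetical synopsis: bin id -> noisy count, assumed >= the true count.
Histogram = Dict[int, int]


def filter_upper_bound(hist: Histogram, lo: int, hi: int) -> int:
    """Upper bound on the rows selected by a range filter covering bins lo..hi."""
    return sum(count for b, count in hist.items() if lo <= b <= hi)


def join_upper_bound(left: Histogram, right: Histogram) -> int:
    """Upper bound on equi-join output size: rows can only match within the
    same bin, and each bin contributes at most left_count * right_count."""
    return sum(count * right.get(b, 0) for b, count in left.items())


if __name__ == "__main__":
    orders = {1: 5, 2: 3, 3: 4}      # illustrative noisy counts
    customers = {1: 2, 3: 6, 4: 1}
    print(filter_upper_bound(orders, 2, 3))
    print(join_upper_bound(orders, customers))
```

Because these bounds depend only on the released synopses, they can be computed before any query runs and handed to a cost model when comparing candidate plans.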