RIPOST: Two-Phase Private Decomposition for Multidimensional Data

📅 2025-02-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
In differentially private (DP) high-dimensional data publishing, existing methods suffer from rigid privacy budget allocation and domain decomposition reliant on manually specified recursion depth. Method: We propose a two-stage data-aware domain decomposition framework: first separating non-empty from empty subdomains, then recursively partitioning non-empty subdomains based on the mean function—without requiring a pre-specified depth. We further introduce an adaptive splitting strategy and a dynamic privacy budget allocation mechanism, ensuring minimal decomposition granularity while minimizing query error. Contribution/Results: Our approach overcomes the limitations of fixed-depth decomposition, jointly optimizing structural coherence and utility. Evaluated on multiple real-world datasets, it significantly outperforms state-of-the-art methods, achieving an average 37% improvement in query accuracy—substantially enhancing the practicality and analytical utility of released data.

Technology Category

Application Category

📝 Abstract
Differential privacy (DP) is considered as the gold standard for data privacy. While the problem of answering simple queries and functions under DP guarantees has been thoroughly addressed in recent years, the problem of releasing multidimensional data under DP remains challenging. In this paper, we focus on this problem, in particular on how to construct privacy-preserving views using a domain decomposition approach. The main idea is to recursively split the domain into sub-domains until a convergence condition is met. The resulting sub-domains are perturbed and then published in order to be used to answer arbitrary queries. Existing methods that have addressed this problem using domain decomposition face two main challenges: (i) efficient privacy budget management over a variable and undefined decomposition depth $h$; and (ii) defining an optimal data-dependent splitting strategy that minimizes the error in the sub-domains while ensuring the smallest possible decomposition. To address these challenges, we present RIPOST, a multidimensional data decomposition algorithm that bypasses the constraint of predefined depth $h$ and applies a data-aware splitting strategy to optimize the quality of the decomposition results.The core of RIPOST is a two-phase strategy that separates non-empty sub-domains at an early stage from empty sub-domains by exploiting the properties of multidimensional datasets, and then decomposes the resulting sub-domains with minimal inaccuracies using the mean function. Moreover, RIPOST introduces a privacy budget distribution that allows decomposition without requiring prior computation of the depth $h$. Through extensive experiments, we demonstrated that exttt{RIPOST} outperforms state-of-the-art methods in terms of data utility and accuracy on a variety of datasets and test cases
Problem

Research questions and friction points this paper is trying to address.

Private decomposition of multidimensional data
Efficient privacy budget management
Optimal data-dependent splitting strategy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Two-phase private decomposition strategy
Data-aware splitting minimizes errors
Privacy budget without predefined depth
🔎 Similar Papers
No similar papers found.