🤖 AI Summary
This paper addresses subset counting queries over set-valued data under Local Differential Privacy (LDP), where conventional value-perturbation approaches suffer from limited statistical utility.
Method: We propose a novel index-randomization paradigm: instead of perturbing raw set elements, we randomize their indices in an encoded space. Specifically, we introduce the CRIAD framework, which combines a multi-dummy, multi-sample, and multi-group strategy to improve estimation accuracy while strictly satisfying LDP.
Contribution/Results: We formally prove that CRIAD satisfies ε-LDP. Extensive experiments show that it consistently outperforms state-of-the-art value-perturbation mechanisms across diverse domain sizes and privacy budgets ε, achieving higher query accuracy with better scalability and flexibility, without weakening the privacy guarantee.
📝 Abstract
Local Differential Privacy (LDP) is the predominant privacy model for safeguarding individual data privacy. Existing perturbation mechanisms typically perturb the original values to achieve acceptable privacy, which inevitably causes value distortion and utility deterioration. In this work, we propose an alternative approach: instead of perturbing values, we randomize the indices of values while still ensuring a rigorous LDP guarantee. Inspired by the deniability of randomized indices, we present CRIAD for answering subset counting queries on set-valued data. By integrating a multi-dummy, multi-sample, and multi-group strategy, CRIAD is a fully scalable solution that offers flexibility across various privacy requirements and domain sizes, and achieves more accurate query results than existing methods. Through comprehensive theoretical analysis and extensive experimental evaluations, we validate the effectiveness of CRIAD and demonstrate its superiority over traditional value-perturbation mechanisms.
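To make the index-randomization idea concrete, the sketch below implements a standard padding-and-sampling protocol with generalized randomized response (GRR) over an augmented index domain: each user pads their set with dummy indices, samples one index, and perturbs it before reporting. This is a generic illustration of index-level randomization under LDP, not the CRIAD mechanism itself; all parameter names and values here are illustrative assumptions.

```python
import math
import random


def grr_probs(eps, D):
    """GRR probabilities over a domain of size D: keep the true index
    with probability p, otherwise switch to one of the other D-1 indices."""
    p = math.exp(eps) / (math.exp(eps) + D - 1)
    q = 1.0 / (math.exp(eps) + D - 1)
    return p, q


def pad_and_sample(user_set, s, d, m, rng):
    """Pad the user's set (indices in [0, d)) with dummy indices
    d..d+m-1 up to fixed size s, then sample one index uniformly."""
    dummies = range(d, d + m)
    padded = list(user_set) + rng.sample(dummies, s - len(user_set))
    return rng.choice(padded)


def grr_perturb(idx, eps, D, rng):
    """Report idx truthfully w.p. p, else a uniform other index in [0, D)."""
    p, _ = grr_probs(eps, D)
    if rng.random() < p:
        return idx
    other = rng.randrange(D - 1)  # uniform over the D-1 remaining indices
    return other if other < idx else other + 1


def estimate_count(count_i, n, s, eps, D):
    """Unbiased estimate of how many of the n users hold item i,
    given that count_i users reported index i."""
    p, q = grr_probs(eps, D)
    t = (count_i / n - q) / (p - q)  # debiased P(sampled index = i)
    return n * s * t                 # undo the 1/s sampling rate
```

For example, with domain size d = 10, m = 2 dummies (so D = 12), padded size s = 3, and ε = 1, a report of index i has probability q + (p - q) · n_i/(n·s), and `estimate_count` inverts this exactly. The ratio p/q = e^ε is what provides the deniability of any single reported index.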