🤖 AI Summary
In differential privacy (DP) deployments, controllers struggle to interpret how the global privacy parameter ε quantifies individual privacy loss, leading to suboptimal and opaque choices of ε. Method: The authors propose the relative disclosure risk (RDR), a dataset-level, individual-centric indicator that quantifies each within-dataset individual's disclosure risk relative to the others; derive its theoretical properties; design an ε-selection algorithm driven by controllers' privacy preferences, with a variant that finds and releases ε while itself satisfying DP; and introduce a mechanism that bounds cumulative privacy leakage across multiple queries without requiring a pre-specified total privacy budget. Contribution/Results: Through DP analysis, an IRB-approved user study, and scalability experiments, they show that the RDR helps controllers make better-justified and more interpretable ε decisions, that the algorithms are efficient and scalable, and that the mechanism bounds cumulative leakage across adaptive query sequences while preserving utility.
📝 Abstract
Differential privacy (DP) enables private data analysis. In a typical DP deployment, controllers manage individuals' sensitive data and are responsible for answering analysts' queries while protecting individuals' privacy. They do so by choosing the privacy parameter $\epsilon$, which controls the degree of privacy for all individuals in all possible datasets. However, it is challenging for controllers to choose $\epsilon$ because of the difficulty of interpreting the privacy implications of such a choice for the within-dataset individuals. To address this challenge, we first derive a relative disclosure risk indicator (RDR) that indicates the impact of choosing $\epsilon$ on the within-dataset individuals' disclosure risk. We then design an algorithm to find $\epsilon$ based on controllers' privacy preferences expressed as a function of the within-dataset individuals' RDRs, and an alternative algorithm that finds and releases $\epsilon$ while itself satisfying DP. Lastly, we propose a solution that bounds the total privacy leakage when using the algorithm to answer multiple queries, without requiring controllers to set a total privacy budget. We evaluate our contributions through an IRB-approved user study showing that the RDR helps controllers choose $\epsilon$, and through experimental evaluations showing that our algorithms are efficient and scalable.
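To make the abstract's core idea concrete, the sketch below illustrates the standard Bayesian (hypothesis-testing) reading of $\epsilon$ that motivates disclosure-risk indicators like the RDR: under $\epsilon$-DP, an adversary's posterior belief that a target individual is in the dataset is bounded as a function of $\epsilon$ and the prior. Note this is a generic textbook bound, not the paper's RDR definition; the function names and the risk-driven inversion `epsilon_for_risk` are illustrative assumptions.

```python
import math

def posterior_disclosure_bound(epsilon: float, prior: float) -> float:
    """Upper bound on the adversary's posterior belief that a target
    individual is in the dataset, given prior belief `prior`, under
    epsilon-DP. Standard Bayesian interpretation of epsilon; this is
    NOT the paper's RDR, only an illustration of the same idea."""
    # epsilon-DP multiplies the prior odds by at most e^epsilon.
    odds = math.exp(epsilon) * prior / (1.0 - prior)
    return odds / (1.0 + odds)

def epsilon_for_risk(max_risk: float, prior: float) -> float:
    """Largest epsilon whose posterior bound stays at or below
    `max_risk` (closed-form inversion of the bound above). Mirrors,
    in spirit, choosing epsilon from a risk preference."""
    target_odds = max_risk / (1.0 - max_risk)
    prior_odds = prior / (1.0 - prior)
    return math.log(target_odds / prior_odds)

# With a uniform prior (0.5) and a tolerated posterior risk of 0.75,
# the bound permits epsilon = ln(3) ≈ 1.0986.
eps = epsilon_for_risk(0.75, 0.5)
```

A preference-driven selection algorithm of the kind the abstract describes would, roughly, evaluate such per-individual risks across the actual dataset and pick the largest $\epsilon$ keeping all of them within the controller's stated tolerance.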