How to Get Actual Privacy and Utility from Privacy Models: the k-Anonymity and Differential Privacy Families

📅 2025-10-13
🤖 AI Summary
Mainstream privacy models suffer from fundamental limitations. k-Anonymity operates syntactically, which leaves it vulnerable to background-knowledge attacks and without semantic constraints on confidential values; differential privacy faces a sharp privacy–utility trade-off, where small privacy budgets cause severe data distortion and large budgets void the privacy guarantee. Method: the paper proposes Semantic k-Anonymity, which formally incorporates domain-specific semantic constraints and dependencies among sensitive attributes into the equivalence-class partitioning mechanism, strengthening disclosure resistance without sacrificing data utility. Contribution/Results: through formal modeling, principled semantic-constraint design, and empirical risk assessment, the authors argue that Semantic k-Anonymity achieves more robust privacy protection and higher data utility than conventional k-anonymity and differential privacy in realistic settings, thereby reducing reliance on post-hoc risk evaluation.
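The disclosure problem the summary describes can be made concrete with a minimal sketch (not the paper's method — the table, column names, and helper functions below are illustrative assumptions): a table can satisfy k-anonymity over its quasi-identifiers while an equivalence class still discloses the sensitive value, because every record in the class shares it.

```python
from collections import defaultdict

def is_k_anonymous(records, quasi_ids, k):
    """True iff every equivalence class over the quasi-identifiers
    contains at least k records."""
    classes = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_ids)
        classes[key].append(rec)
    return all(len(group) >= k for group in classes.values())

def disclosing_classes(records, quasi_ids, sensitive):
    """Equivalence classes whose sensitive values are all identical:
    k-anonymity holds syntactically, yet the value is disclosed."""
    classes = defaultdict(set)
    for rec in records:
        key = tuple(rec[q] for q in quasi_ids)
        classes[key].add(rec[sensitive])
    return [key for key, vals in classes.items() if len(vals) == 1]

# Toy table: generalized ZIP and age band are quasi-identifiers,
# diagnosis is the confidential attribute.
table = [
    {"zip": "437**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "437**", "age": "30-39", "diagnosis": "flu"},
    {"zip": "438**", "age": "40-49", "diagnosis": "flu"},
    {"zip": "438**", "age": "40-49", "diagnosis": "cancer"},
]
print(is_k_anonymous(table, ["zip", "age"], 2))        # True
print(disclosing_classes(table, ["zip", "age"], "diagnosis"))
# [('437**', '30-39')]: 2-anonymous, yet every member's diagnosis is revealed
```

This is exactly the gap that constraints on the confidential values — as in the semantic reformulation — are meant to close.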

📝 Abstract
Privacy models were introduced in privacy-preserving data publishing and statistical disclosure control with the promise to end the need for costly empirical assessment of disclosure risk. We examine how well this promise is kept by the main privacy models. We find they may fail to provide adequate protection guarantees because of problems in their definition or incur unacceptable trade-offs between privacy protection and utility preservation. Specifically, k-anonymity may not entirely exclude disclosure if enforced with deterministic mechanisms or without constraints on the confidential values. On the other hand, differential privacy (DP) incurs unacceptable utility loss for small budgets and its privacy guarantee becomes meaningless for large budgets. In the latter case, an ex post empirical assessment of disclosure risk becomes necessary, undermining the main appeal of privacy models. Whereas the utility preservation of DP can only be improved by relaxing its privacy guarantees, we argue that a semantic reformulation of k-anonymity can offer more robust privacy without losing utility with respect to traditional syntactic k-anonymity.
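The abstract's claim that DP "incurs unacceptable utility loss for small budgets" follows directly from the standard Laplace mechanism, whose noise scale is sensitivity/ε. A quick simulation (a sketch, not from the paper; parameter values are arbitrary) shows the expected absolute error growing as ε shrinks:

```python
import random

def laplace_noise(scale, rng):
    # Laplace(0, scale) = random sign x Exponential(mean = scale)
    magnitude = rng.expovariate(1.0 / scale)
    return magnitude if rng.random() < 0.5 else -magnitude

def mean_abs_error(epsilon, sensitivity=1.0, trials=100_000, seed=0):
    """Average |noise| added by the Laplace mechanism for a query of the
    given sensitivity under an epsilon-DP budget."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return sum(abs(laplace_noise(scale, rng)) for _ in range(trials)) / trials

for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps:>4}: mean |error| ~ {mean_abs_error(eps):.2f}")
```

The expected absolute error equals sensitivity/ε, so a strict budget (ε = 0.1) distorts a unit-sensitivity count query by about 10 on average, while the lax ε = 10 that keeps the error near 0.1 provides only a near-vacuous privacy guarantee — the trade-off the abstract criticizes.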
Problem

Research questions and friction points this paper is trying to address.

k-anonymity fails to fully prevent data disclosure risks
Differential privacy causes excessive utility loss with small budgets
Current privacy models require costly empirical risk assessments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic reformulation of k-anonymity offers more robust privacy without utility loss
Analysis of DP's utility–privacy trade-off across small and large budgets
Shows deterministic k-anonymity mechanisms may not entirely exclude disclosure
Josep Domingo-Ferrer
Distinguished Full Professor, Universitat Rovira i Virgili, Director-CYBERCAT, FIEEE, ACM DS
Data protection · Privacy · Cybersecurity · Machine learning · Statistical Disclosure Control
David Sánchez
Department of Computer Science and Mathematics, CYBERCAT-Center for Cybersecurity Research of Catalonia, Universitat Rovira i Virgili, Av. Països Catalans, 26, Tarragona, 43007, Catalonia