Revisiting the attacker's knowledge in inference attacks against Searchable Symmetric Encryption

📅 2025-04-14
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work investigates how inference attacks against Searchable Symmetric Encryption (SSE) depend on the quality of the “similar data” available to the adversary. It proposes a general statistical framework that formally defines “similar data” and shows that the non-uniqueness of this notion critically impacts how robust an attack's reported accuracy really is. It further shows that enforcing a maximum index size degrades inference attacks, and gives a statistical method to estimate an appropriate cap for a given attack and dataset. Within the leakage-abuse model, the authors combine probabilistic modeling with statistical estimation theory and validate the findings on the Enron dataset: capping the index size at 200 keeps the best known attack's accuracy below 5% with high probability. The results yield a quantifiable, data-similarity-aware defense configuration guideline for SSE systems, bridging theoretical analysis with practical deployment constraints.

📝 Abstract
Encrypted search schemes have been proposed to address growing privacy concerns. However, several leakage-abuse attacks have highlighted security vulnerabilities. Recent attacks assume attacker knowledge containing data “similar” to the indexed data. However, this vague assumption is barely discussed in the literature: how likely is it for an attacker to obtain “similar enough” data? Our paper provides novel statistical tools, usable on any attack in this setting, to analyze its sensitivity to data similarity. First, we introduce a mathematical model based on statistical estimators to analytically understand the attacker's knowledge and the notion of similarity. Second, we conceive statistical tools to model the influence of similarity on attack accuracy. We apply our tools to three existing attacks to answer questions such as: is similarity the only factor influencing the accuracy of a given attack? Third, we show that enforcing a maximum index size can make the “similar-data” assumption harder to satisfy. In particular, we propose a statistical method to estimate an appropriate maximum size for a given attack and dataset. For the best known attack on the Enron dataset, a maximum index size of 200 guarantees (with high probability) that the attack accuracy stays below 5%.
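The abstract's central point, that attack accuracy hinges on how similar the adversary's auxiliary data is to the indexed data, can be illustrated with a toy rank-matching attack. This is a sketch for intuition only, not the paper's framework or any of the three attacks it analyzes; the keyword names and noise model are invented for the example.

```python
import random

def frequency_matching_attack(real_freqs, adversary_freqs):
    """Match observed query tokens to keywords by frequency rank.

    Toy stand-in for frequency-based leakage-abuse attacks: the adversary
    sees only the frequency rank of each (encrypted) query and guesses the
    underlying keyword using the ranks in its own "similar" dataset.
    """
    observed = sorted(real_freqs, key=real_freqs.get, reverse=True)
    guessed = sorted(adversary_freqs, key=adversary_freqs.get, reverse=True)
    correct = sum(o == g for o, g in zip(observed, guessed))
    return correct / len(observed)

random.seed(0)

# Zipf-like keyword frequencies for a hypothetical 100-keyword index.
true_freqs = {f"kw{i}": 1.0 / (i + 1) for i in range(100)}

# An adversary with exact knowledge of the distribution recovers everything.
acc_exact = frequency_matching_attack(true_freqs, true_freqs)

# An adversary with merely "similar" data (noisy frequency estimates)
# recovers fewer keywords: accuracy is sensitive to data similarity.
similar_freqs = {k: v * random.uniform(0.5, 1.5) for k, v in true_freqs.items()}
acc_similar = frequency_matching_attack(true_freqs, similar_freqs)

print(f"exact knowledge: {acc_exact:.2f}, similar data: {acc_similar:.2f}")
```

Under this toy model, perfect knowledge yields 100% recovery while noisy "similar" estimates shuffle the frequency ranks and lower accuracy; the paper's contribution is to make this dependence statistically precise for real attacks.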
Problem

Research questions and friction points this paper is trying to address.

Analyzing sensitivity of inference attacks to data similarity
Modeling attacker's knowledge impact on attack accuracy
Determining maximum index size to limit attack success
Innovation

Methods, ideas, or system contributions that make the work stand out.

Statistical model for attacker's knowledge similarity
Tools to analyze attack accuracy sensitivity
Method to estimate maximum secure index size
Marc Damie
Inria, France; University of Twente, The Netherlands
Jean-Benoist Leger
Université de technologie de Compiègne, CNRS, Heudiasyc, France; Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA Paris-Saclay
Florian Hahn
University of Twente
Computer Security, Applied Cryptography
Andreas Peter
Professor, Safety-Security-Interaction Group, Carl von Ossietzky Universität Oldenburg, Germany
Security, Privacy