Revisiting the attacker's knowledge in inference attacks against Searchable Symmetric Encryption

📅 2025-04-14
📈 Citations: 1
Influential: 0
🤖 AI Summary
This work investigates how inference attacks against Searchable Symmetric Encryption (SSE) depend on the quality of the “similar data” available to the adversary. It proposes a general statistical framework that formally defines “similar data” and shows that the non-uniqueness of this notion critically impacts how robust an attack's reported accuracy really is. It further shows that enforcing a maximum index size degrades inference attacks, and gives a statistical method to estimate an appropriate cap for a given attack and dataset. Within the leakage-abuse model, the authors combine probabilistic modeling with statistical estimation theory and validate the findings on the Enron dataset: capping the index size at 200 keeps the best known attack's accuracy below 5% with high probability. The results yield a quantifiable, data-similarity-aware defense configuration guideline for SSE systems, bridging theoretical analysis with practical deployment constraints.

📝 Abstract
Encrypted search schemes have been proposed to address growing privacy concerns. However, several leakage-abuse attacks have highlighted security vulnerabilities. Recent attacks assume attacker knowledge containing data “similar” to the indexed data. However, this vague assumption is barely discussed in the literature: how likely is it for an attacker to obtain “similar enough” data? Our paper provides novel statistical tools, usable on any attack in this setting, to analyze its sensitivity to data similarity. First, we introduce a mathematical model based on statistical estimators to analytically understand the attacker's knowledge and the notion of similarity. Second, we conceive statistical tools to model the influence of similarity on attack accuracy. We apply our tools to three existing attacks to answer questions such as: is similarity the only factor influencing the accuracy of a given attack? Third, we show that enforcing a maximum index size can make the “similar-data” assumption harder to satisfy. In particular, we propose a statistical method to estimate an appropriate maximum size for a given attack and dataset. For the best known attack on the Enron dataset, a maximum index size of 200 guarantees (with high probability) that the attack accuracy stays below 5%.
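The abstract's central point, that attack accuracy hinges on how similar the adversary's auxiliary data is to the indexed data, can be illustrated with a toy rank-matching attack. This is a sketch for intuition only, not the paper's framework or any of the three attacks it analyzes; the keyword names and noise model are invented for the example.

```python
import random

def frequency_matching_attack(real_freqs, adversary_freqs):
    """Match observed query tokens to keywords by frequency rank.

    Toy stand-in for frequency-based leakage-abuse attacks: the adversary
    sees only the frequency rank of each (encrypted) query and guesses the
    underlying keyword using the ranks in its own "similar" dataset.
    """
    observed = sorted(real_freqs, key=real_freqs.get, reverse=True)
    guessed = sorted(adversary_freqs, key=adversary_freqs.get, reverse=True)
    correct = sum(o == g for o, g in zip(observed, guessed))
    return correct / len(observed)

random.seed(0)

# Zipf-like keyword frequencies for a hypothetical 100-keyword index.
true_freqs = {f"kw{i}": 1.0 / (i + 1) for i in range(100)}

# An adversary with exact knowledge of the distribution recovers everything.
acc_exact = frequency_matching_attack(true_freqs, true_freqs)

# An adversary with merely "similar" data (noisy frequency estimates)
# recovers fewer keywords: accuracy is sensitive to data similarity.
similar_freqs = {k: v * random.uniform(0.5, 1.5) for k, v in true_freqs.items()}
acc_similar = frequency_matching_attack(true_freqs, similar_freqs)

print(f"exact knowledge: {acc_exact:.2f}, similar data: {acc_similar:.2f}")
```

Under this toy model, perfect knowledge yields 100% recovery while noisy "similar" estimates shuffle the frequency ranks and lower accuracy; the paper's contribution is to make this dependence statistically precise for real attacks.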
Problem

Research questions and friction points this paper is trying to address.

Analyzing sensitivity of inference attacks to data similarity
Modeling attacker's knowledge impact on attack accuracy
Determining maximum index size to limit attack success
Innovation

Methods, ideas, or system contributions that make the work stand out.

Statistical model for attacker's knowledge similarity
Tools to analyze attack accuracy sensitivity
Method to estimate maximum secure index size
Marc Damie
Inria, France; University of Twente, The Netherlands
Jean-Benoist Leger
Université de technologie de Compiègne, CNRS, Heudiasyc, France; Université Paris-Saclay, AgroParisTech, INRAE, UMR MIA Paris-Saclay
Florian Hahn
University of Twente
Computer Security, Applied Cryptography
Andreas Peter
Professor, Safety-Security-Interaction Group, Carl von Ossietzky Universität Oldenburg, Germany
Security, Privacy