🤖 AI Summary
This work investigates the limits of frequency analysis as a leakage-abuse attack on searchable encryption schemes that support encrypted range queries, under a threat model in which the attacker knows only the access-pattern leakage and the query distribution. The authors propose LAMA (Leakage-Abuse via Matching), a generic attack framework that extends frequency analysis to high-dimensional encrypted data and to any class of convex queries, not just axis-aligned rectangles. They identify a parameterization of LAMA that takes frequency analysis to its limit, proving that no additional frequency matching can refine the attacker's result. An implementation and benchmark of LAMA reconstructs, for the first time, plaintext data from encrypted range queries spanning up to four dimensions. The same analysis identifies query distributions that make frequency analysis difficult for the attacker and can therefore serve as a mitigation mechanism.
📝 Abstract
Searchable encryption (SE) is the most scalable cryptographic primitive for searching on encrypted data. Typical SE constructions allow access-pattern leakage, which reveals which encrypted records are retrieved in the server's responses. All known generic cryptanalyses assume either that the queries are issued uniformly at random or that the attacker observes the search-pattern leakage. It remains unclear what can be reconstructed from the access-pattern leakage and knowledge of the query distribution alone. In this work, we focus on the cryptanalytic technique of frequency analysis in the context of leakage-abuse attacks on schemes that support encrypted range queries. Frequency analysis matches the observed retrieval frequency of an encrypted record to a plaintext value whose probability of retrieval, derived from the known query distribution, agrees with it. We generalize this underexplored cryptanalytic technique and introduce a generic attack framework called Leakage-Abuse via Matching (LAMA) that works even on high-dimensional encrypted data. We identify a parameterization of LAMA that brings frequency analysis to its limit; that is, we prove that there is no additional frequency matching that an attacker can perform to refine the result. Furthermore, we show that our results hold for any class of convex queries, and not just axis-aligned rectangles, the query class assumed by all other attacks on range schemes. Using these results, we identify query distributions that make frequency analysis challenging for the attacker and, thus, can act as a mitigation mechanism. Finally, we implement and benchmark LAMA and reconstruct, for the first time, plaintext data from encrypted range queries spanning up to four dimensions.
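To make the frequency-matching idea concrete, the following is a minimal one-dimensional sketch in Python. It is not the LAMA framework itself. The domain size, the uniform distribution over ranges, the record identifiers, and all function names are assumptions chosen for illustration. The sketch derives each plaintext value's retrieval probability from a known query distribution, then lists the values whose probability is closest to each record's observed retrieval frequency.

```python
from fractions import Fraction


def uniform_range_distribution(domain_size):
    """Hypothetical query distribution: every range [a, b] is equally likely."""
    ranges = [(a, b) for a in range(domain_size) for b in range(a, domain_size)]
    return {r: Fraction(1, len(ranges)) for r in ranges}


def retrieval_probabilities(domain_size, query_dist):
    """P(value v is retrieved) = total probability of the ranges that cover v."""
    probs = [Fraction(0)] * domain_size
    for (a, b), p in query_dist.items():
        for v in range(a, b + 1):
            probs[v] += p
    return probs


def access_pattern_counts(hidden_values, queries):
    """Simulated leakage: how often each encrypted record appears in a response."""
    counts = {rec: 0 for rec in hidden_values}
    for a, b in queries:
        for rec, v in hidden_values.items():
            if a <= v <= b:
                counts[rec] += 1
    return counts


def match_by_frequency(counts, num_queries, probs):
    """For each record, list the values whose retrieval probability is closest
    to the record's observed retrieval frequency."""
    candidates = {}
    for rec, count in counts.items():
        freq = Fraction(count, num_queries)
        best = min(abs(p - freq) for p in probs)
        candidates[rec] = [v for v, p in enumerate(probs) if abs(p - freq) == best]
    return candidates


# Assumed toy setup: domain {0, ..., 4} and three records with hidden values.
domain_size = 5
query_dist = uniform_range_distribution(domain_size)
probs = retrieval_probabilities(domain_size, query_dist)

hidden_values = {"r0": 0, "r1": 2, "r2": 3}  # unknown to the attacker
queries = list(query_dist)                   # idealized: each range issued once
counts = access_pattern_counts(hidden_values, queries)

candidates = match_by_frequency(counts, len(queries), probs)
print(candidates)  # {'r0': [0, 4], 'r1': [2], 'r2': [1, 3]}
```

In this toy setting the uniform distribution gives values `v` and `4 - v` the same retrieval probability, so frequency matching alone narrows a record down to a set of candidates rather than a single value. This illustrates, in a small way, why the choice of query distribution affects how much an attacker can recover.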