Prefix Siphoning: Exploiting LSM-Tree Range Filters For Information Disclosure (Full Version)

📅 2023-06-07

🏛️ USENIX Annual Technical Conference

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work uncovers a timing side channel in key-value stores. Prior research focused on leaking stored values; this paper shows that *key disclosure* is also a serious threat. It introduces the “prefix siphoning” attack, which exploits small timing differences caused by the range filters (e.g., SuRF, prefix Bloom filters) built into LSM-tree-based storage engines. By measuring response latency for queries on non-existent keys, an attacker can infer prefixes of stored keys and, in some cases, full keys. The attack does not depend on any external mechanism; it relies only on the structure and optimization behavior of the LSM-tree itself. Through security modeling, reverse engineering of filter internals, and statistical inference, the authors validate the attack on mainstream engines including LevelDB and RocksDB: tens of thousands of queries suffice to recover sensitive key prefixes with high confidence, far fewer than brute-force enumeration would need. The summary presents this as the first systematic identification and exploitation of a *key-level* side channel induced by LSM-tree range filters, in contrast to earlier timing attacks that target only values.

📝 Abstract

Key-value stores typically leave access control to the systems for which they act as storage engines. Unfortunately, attackers may circumvent such read access controls via timing attacks on the key-value store, which use differences in query response times to glean information about stored data. To date, key-value store timing attacks have aimed to disclose stored values and have exploited external mechanisms that can be disabled for protection. In this paper, we point out that key disclosure is also a security threat -- and demonstrate key disclosure timing attacks that exploit mechanisms of the key-value store itself. We target LSM-tree based key-value stores utilizing range filters, which have been recently proposed to optimize LSM-tree range queries. We analyze the impact of the range filters SuRF and prefix Bloom filter on LSM-trees through a security lens, and show that they enable a key disclosure timing attack, which we call prefix siphoning. Prefix siphoning successfully leverages benign queries for non-present keys to identify prefixes of actual keys -- and in some cases, full keys -- in scenarios where brute force searching for keys (via exhaustive enumeration or random guesses) is infeasible.
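The core idea of the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: `ToyStore`, `siphon_prefixes`, the simulated cost values, and the padding character are all hypothetical, and an exact prefix set stands in for a real (probabilistic) prefix Bloom filter. It only shows why a filter that short-circuits lookups for absent prefixes lets queries for non-present keys reveal which prefixes exist.

```python
import itertools


class ToyStore:
    """Hypothetical stand-in for an LSM-tree store guarded by a prefix filter."""

    def __init__(self, keys, prefix_len):
        self.keys = set(keys)
        self.prefix_len = prefix_len
        # Assumption: an exact set of prefixes replaces a real prefix
        # Bloom filter, so this toy has no filter false positives.
        self.prefixes = {k[:prefix_len] for k in keys}

    def get(self, key):
        """Return (found, simulated_cost) for a point query."""
        if key[: self.prefix_len] not in self.prefixes:
            return False, 1  # filter rejects: no storage access, fast
        return key in self.keys, 100  # filter passes: storage lookup, slow


def siphon_prefixes(store, alphabet, prefix_len, threshold=50):
    """Probe non-present keys and keep the prefixes whose queries are slow."""
    found = []
    for combo in itertools.product(alphabet, repeat=prefix_len):
        prefix = "".join(combo)
        probe = prefix + "#"  # padding chosen so the probe is not a real key
        present, cost = store.get(probe)
        if not present and cost > threshold:
            found.append(prefix)
    return found


store = ToyStore(["abca1", "dcb77"], prefix_len=3)
leaked = siphon_prefixes(store, "abcd", prefix_len=3)
print(leaked)
```

In this toy, enumerating all short prefixes is cheap, while enumerating full keys would not be; once a prefix is known, only the remaining suffix has to be searched.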

Problem

Research questions and friction points this paper is trying to address.

Timing attacks bypass key-value store access controls

Disclosing keys via LSM-tree range filter vulnerabilities

Exploiting query timing differences to identify key prefixes
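In practice, response times are noisy, so a single measurement rarely separates a filter-rejected query from one that reaches storage. The following is a minimal sketch of one way to classify repeated measurements; using the median and a midpoint threshold is an illustrative assumption here, not the statistical procedure the paper describes, and the function names and sample values are hypothetical.

```python
from statistics import median


def calibrate_threshold(fast_samples, slow_samples):
    """Midpoint between the median latency of known-fast and known-slow queries."""
    return (median(fast_samples) + median(slow_samples)) / 2


def looks_filter_positive(samples, threshold):
    """Classify a probe as 'passed the filter' if its median latency is high."""
    return median(samples) > threshold


# Hypothetical latency samples (arbitrary units); 200 and 150 are noise spikes.
threshold = calibrate_threshold([10, 11, 12, 200], [95, 100, 105, 110])
slow_probe = looks_filter_positive([98, 12, 101, 99], threshold)
fast_probe = looks_filter_positive([11, 10, 150, 12], threshold)
print(threshold, slow_probe, fast_probe)
```

The median is used because it tolerates occasional outliers in either direction, which a mean would not.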

Innovation

Methods, ideas, or system contributions that make the work stand out.

Exploits LSM-tree range filters

Uses timing attacks for key disclosure

Leverages benign queries for prefix identification
