🤖 AI Summary
To address performance bottlenecks in LSM-based key-value stores (KVS) deployed on heterogeneous storage hierarchies—bottlenecks caused by static data placement—this paper proposes Keigo, a concurrency- and workload-aware storage middleware that adapts to concurrent workloads’ varying bandwidth, capacity, and parallelism demands. Keigo introduces three core techniques: (1) concurrency-aware data placement; (2) a persistent read-only cache; and (3) context-based I/O differentiation. By combining LSM-level semantics with runtime concurrency information, Keigo places files across devices adaptively, without requiring extensive profiling or prior workload characterization. It is portable across mainstream KVS engines, including RocksDB, LevelDB, and Speedb. Evaluation under both synthetic and realistic workloads demonstrates up to 4× higher throughput for write-heavy workloads and up to 18× for read-heavy workloads compared to general-purpose storage systems and specialized LSM KVS, significantly unlocking the potential of heterogeneous hardware.
📝 Abstract
We present Keigo, a concurrency- and workload-aware storage middleware that enhances the performance of log-structured merge key-value stores (LSM KVS) when they are deployed on a hierarchy of storage devices. The key observation behind Keigo is that there is no one-size-fits-all placement of data across the storage hierarchy that optimizes for all workloads. Hence, to leverage the benefits of combining different storage devices, Keigo places files across devices based on their parallelism, I/O bandwidth, and capacity. We introduce three techniques: concurrency-aware data placement, persistent read-only caching, and context-based I/O differentiation. Keigo is portable across different LSMs, adapts to dynamic workloads, and does not require extensive profiling. Our system enables established production KVS such as RocksDB, LevelDB, and Speedb to benefit from heterogeneous storage setups. We evaluate Keigo using synthetic and realistic workloads, showing that it improves the throughput of production-grade LSMs by up to 4x for write- and 18x for read-heavy workloads when compared to general-purpose storage systems and specialized LSM KVS.