🤖 AI Summary
To address performance bottlenecks in LSM-based key-value stores (KVS) deployed on heterogeneous storage hierarchies—bottlenecks caused by static data placement—this paper proposes Keigo, a concurrency- and workload-aware storage middleware that adapts to concurrent workloads’ varying bandwidth, capacity, and parallelism demands. Keigo introduces three core techniques: (1) concurrency-aware data placement; (2) a persistent read-only cache; and (3) context-based I/O differentiation. By combining LSM-level semantics with runtime concurrency information, Keigo places files across devices adaptively, without requiring extensive profiling or prior workload characterization. It is portable across mainstream KVS engines, including RocksDB, LevelDB, and Speedb. Evaluation under both synthetic and realistic workloads demonstrates up to 4× higher throughput for write-heavy workloads and up to 18× for read-heavy workloads compared to general-purpose storage systems and specialized LSM KVS, significantly unlocking the potential of heterogeneous hardware.
📝 Abstract
We present Keigo, a concurrency- and workload-aware storage middleware that enhances the performance of log-structured merge key-value stores (LSM KVS) when they are deployed on a hierarchy of storage devices. The key observation behind Keigo is that there is no one-size-fits-all placement of data across the storage hierarchy that optimizes for all workloads. Hence, to leverage the benefits of combining different storage devices, Keigo places files across devices based on their parallelism, I/O bandwidth, and capacity. We introduce three techniques: concurrency-aware data placement, persistent read-only caching, and context-based I/O differentiation. Keigo is portable across different LSMs, adapts to dynamic workloads, and does not require extensive profiling. Our system enables established production KVS such as RocksDB, LevelDB, and Speedb to benefit from heterogeneous storage setups. We evaluate Keigo using synthetic and realistic workloads, showing that it improves the throughput of production-grade LSMs by up to 4x for write- and 18x for read-heavy workloads when compared to general-purpose storage systems and specialized LSM KVS.