🤖 AI Summary
Traditional key-value stores face a fundamental trade-off among memory overhead, read performance, and write performance, a “trilemma” that limits adaptability. This paper introduces TurtleKV, the first KV system enabling *online, dynamic read-write performance tuning*. Its core innovations are: (1) a bias-free on-disk data structure that eliminates the write bias inherent in LSM-trees; and (2) a fine-grained, runtime memory allocation mechanism that enables bidirectional, real-time optimization of read and write throughput. Evaluated on YCSB, TurtleKV achieves up to 8× the write throughput and up to 5× the read throughput of RocksDB at similar space amplification; versus SplinterDB, it performs up to 40% better on point queries and up to 6× better on range scans, with similar write performance and 50% less space amplification. TurtleKV is the first system to jointly optimize read and write performance under tight space amplification constraints while supporting online, adaptive configuration.
📝 Abstract
High read and write performance is important for generic key/value stores, which are fundamental to modern applications and databases. Yet achieving high performance for both reads and writes is challenging due to traditionally limited memory and the pick-any-two-out-of-three tradeoff among memory use, read performance, and write performance. Existing state-of-the-art approaches limit memory usage and choose a primary dimension (reads or writes) for which to optimize their on-disk structures. They recover performance in the remaining dimension by other mechanisms. This approach limits both a database's maximum performance in the remaining dimension and its dynamic (online) tunability in response to changing workloads. We explore a different approach that dynamically trades memory for read or write performance as needed. We present TurtleKV, which includes a novel unbiased data structure for on-disk storage. TurtleKV also includes a knob that dynamically increases the memory reserved for improving read or write performance. When evaluated on YCSB, TurtleKV achieves up to 8x the write throughput of the industry leader RocksDB and up to 5x the read throughput while incurring similar space amplification. Compared to the state-of-the-art system SplinterDB, TurtleKV performs up to 40% better on point queries and up to 6x better on range scans, and achieves similar write performance, while incurring 50% less space amplification.