O³-LSM: Maximizing Disaggregated LSM Write Performance via Three-Layer Offloading

📅 2026-03-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the write throughput bottleneck in existing disaggregated LSM-based key-value stores, which is constrained by compute node memory capacity and flush bandwidth. To overcome these limitations, the authors propose O³-LSM, a novel architecture featuring a three-tier offloading mechanism: dynamic memtable offloading, cooperative flush offloading, and shard-level parallel transmission, which collectively unlock the full potential of disaggregated memory. Additionally, a cache-enhanced read delegation strategy is introduced to optimize read performance. Experimental results demonstrate that O³-LSM achieves up to 4.5× higher write throughput, 5.2× faster range queries, and 1.8× improved point lookup performance compared to state-of-the-art systems, while reducing P99 latency by 76%.

📝 Abstract
Log-Structured Merge-tree-based Key-Value Stores (LSM-KVS) have been optimized and redesigned for disaggregated storage via techniques such as compaction offloading, which reduces network I/O between compute and storage nodes. However, the constrained memory space and slow flush at the compute node severely limit the overall write throughput of existing designs. In this paper, we propose O³-LSM, a fundamentally new LSM-KVS architecture that leverages shared Disaggregated Memory (DM) to support three-layer offloading: memtable offloading, flush offloading, and the existing compaction offloading. Compared to existing disaggregated LSM-KVS with compaction offloading only, O³-LSM maximizes write performance by removing both the memory-capacity and flush-bandwidth bottlenecks. O³-LSM first leverages a novel DM-Optimized Memtable to achieve dynamic memtable offloading, which extends the write buffer while enabling fast, asynchronous, and parallel memtable transmission. Second, we propose Collaborative Flush Offloading, which decouples the flush control plane from execution and supports memtable flush offloading at any node with dedicated scheduling and global optimizations. Third, O³-LSM is further improved with Shard-Level Optimization, which partitions the memtable into shards based on disjoint key ranges that can be transferred and flushed independently, unlocking parallelism across shards. Finally, to mitigate slow lookups in the disaggregated setting, O³-LSM employs an adaptive Cache-Enhanced Read Delegation mechanism that combines a compact local cache with DM-assisted delegated memtable reads. Our evaluation shows that O³-LSM achieves up to 4.5× write, 5.2× range query, and 1.8× point lookup throughput improvement, and up to 76% P99 latency reduction, compared with Disaggregated-RocksDB, CaaS-LSM, and Nova-LSM.
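The Shard-Level Optimization described above can be sketched in a few lines: because each shard owns a disjoint key range, shards never overlap and can be flushed concurrently without coordination. The sketch below is illustrative only and assumes nothing about O³-LSM's actual implementation; all names (`ShardedMemtable`, `flush_parallel`, the boundary scheme) are hypothetical.

```python
# Illustrative sketch of a memtable partitioned into shards by disjoint
# key ranges, so each shard can be transferred/flushed independently.
# Not the paper's implementation; names and boundary scheme are invented.
import bisect
from concurrent.futures import ThreadPoolExecutor

class ShardedMemtable:
    def __init__(self, boundaries):
        # boundaries like ["g", "p"] split the keyspace into 3 disjoint
        # ranges: (-inf, "g"), ["g", "p"), ["p", +inf)
        self.boundaries = boundaries
        self.shards = [dict() for _ in range(len(boundaries) + 1)]

    def _shard_for(self, key):
        # Binary-search the boundary list to route a key to its shard.
        return bisect.bisect_right(self.boundaries, key)

    def put(self, key, value):
        self.shards[self._shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self._shard_for(key)].get(key)

    def flush_parallel(self, flush_fn):
        # Key ranges are disjoint, so per-shard flushes cannot conflict
        # and may run on different threads (or, in O³-LSM, nodes).
        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            list(pool.map(flush_fn, range(len(self.shards)), self.shards))
```

Disjointness is the key property: it makes shard-level parallelism safe by construction, with no cross-shard locking during flush.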
Problem

Research questions and friction points this paper is trying to address.

LSM-KVS
disaggregated storage
write throughput
memtable flush
memory constraint
Innovation

Methods, ideas, or system contributions that make the work stand out.

three-layer offloading
disaggregated memory
memtable offloading
collaborative flush
shard-level optimization