🤖 AI Summary
While LSM-trees optimize write throughput, their compaction and flush operations induce fundamental trade-offs (write, read, and space amplification) along with performance volatility and resource contention, all of which are exacerbated in multi-tenant distributed environments. This paper presents a systematic survey of LSM-tree optimization research from 2019 to 2024, employing bibliometric analysis and technical taxonomy to categorize advances in tiered compaction, cache-aware coordination, read/write path separation, workload-aware scheduling, and distributed resource management. Distinct from prior surveys, we analyze the co-design challenges arising from multi-tier storage hierarchies, heterogeneous workloads, and cross-tenant resource contention; rigorously characterize the Pareto boundaries among the three amplification effects; and propose future directions targeting high throughput, low latency, and strong tenant isolation. Our synthesis provides both theoretical foundations and practical engineering guidance for next-generation high-performance key-value stores.
📝 Abstract
The LSM-tree is a widely adopted data structure in modern key-value stores that optimizes write performance in write-heavy applications by using append-only writes to achieve sequential I/O. However, the unpredictability of LSM-tree compaction introduces significant challenges: performance variability during peak workloads and in resource-constrained environments, write amplification caused by data rewriting during compactions, read amplification from multi-level queries, the trade-off between read and write performance, and the need for efficient space utilization to mitigate space amplification. Prior surveys of LSM-tree optimizations have addressed these challenges; however, new optimizations have continued to be proposed in recent years. The goal of this survey is to review LSM-tree optimization, focusing on representative works from the past five years. The survey first studies existing solutions for mitigating the performance impact of LSM-tree flush and compaction and for improving basic key-value operations. In addition, distributed key-value stores serve multiple tenants, ranging from tens of thousands to millions of users with diverse requirements. We therefore analyze the new challenges and opportunities in these modern architectures and across various application scenarios. Unlike existing survey papers, this survey provides a detailed discussion of state-of-the-art work on LSM-tree optimizations and outlines future research directions.