Multi-Queue SSD I/O Modeling & Its Implications for Data Structure Design

📅 2025-07-08
🤖 AI Summary
Problem: Existing storage performance models, such as the Disk Access Machine (DAM) model, fail to capture the concurrent I/O characteristics of multi-queue SSDs, which hinders hardware-aware design of external-memory algorithms. Method: We propose MQSSD, a storage abstraction model that treats multi-queue parallelism as a fundamental dimension and identifies concurrent access as the key mechanism behind modern SSDs' high throughput. The model is based on performance characteristics demonstrated on real SSD hardware and is validated through experiments with an LSM-tree-based engine (RocksDB). Contribution/Results: MQSSD is a more accurate abstraction of modern hardware than DAM. Guided by MQSSD, we show how LSM-tree-based storage engines can be optimized for multi-queue SSDs, favoring designs with high concurrency and low serial dependency, some of which appear counterintuitive under traditional models.

📝 Abstract
Understanding the performance profiles of storage devices and how best to utilize them has always been non-trivial due to factors such as seek times, caching, scheduling, concurrent access, flash wear-out, and garbage collection. However, analytical frameworks that provide simplified abstractions of storage performance can still be accurate enough to evaluate external memory algorithms and data structures at the design stage. For example, the Disk Access Machine (DAM) model assumes that a storage device transfers data in fixed-size blocks of size B and that all transfers have unit latency. This abstraction is already sufficient to explain some of the benefits of data structures such as B-trees and Log-Structured Merge trees (LSM trees); however, storage technology advances have significantly reduced current models' accuracy and utility. This paper introduces the Multi-Queue Solid State Drive (MQSSD) model, a new storage abstraction. This model builds upon previous models and aims to more accurately represent the performance characteristics of modern storage hardware. We identify key performance-critical aspects of modern multi-queue solid-state drives on which we base our model and demonstrate these characteristics on actual hardware. We then show how our model can be applied to LSM-tree-based storage engines to optimize them for modern storage hardware. We highlight that leveraging concurrent access is crucial for fully utilizing the high throughput of multi-queue SSDs, enabling designs that may appear counterintuitive under traditional paradigms. We then validate these insights through experiments using Facebook's LSM-tree-based key-value store, RocksDB. We conclude that the MQSSD model offers a more accurate abstraction of modern hardware than previous models, allowing for greater insight and optimization.
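
The DAM cost described in the abstract (fixed-size blocks of size B, unit latency per transfer) can be sketched in a few lines of Python. The second function is not the MQSSD model, whose details are not given here; it is a hypothetical toy, with assumed names `toy_rounds`, `parallelism`, and `dependent`, that only illustrates why independent I/Os can benefit from concurrent access while a chain of dependent I/Os cannot.

```python
def dam_transfers(num_bytes: int, block_size: int) -> int:
    """DAM cost: number of fixed-size block transfers, each with unit latency."""
    return -(-num_bytes // block_size)  # integer ceiling division


def toy_rounds(num_ios: int, parallelism: int, dependent: bool) -> int:
    """Hypothetical toy cost (an assumption, NOT the paper's MQSSD model).

    Counts unit-latency rounds. Dependent I/Os (each needs the previous
    result) run one per round; independent I/Os run up to `parallelism`
    per round.
    """
    if dependent:
        return num_ios
    return -(-num_ios // parallelism)


if __name__ == "__main__":
    print(dam_transfers(10 * 4096, 4096))                # 10 block transfers
    print(toy_rounds(8, parallelism=4, dependent=True))   # 8 rounds
    print(toy_rounds(8, parallelism=4, dependent=False))  # 2 rounds
```

Under DAM alone, both access patterns in the toy cost the same 8 transfers; the difference appears only once concurrency is part of the cost measure.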

Problem

Research questions and friction points this paper is trying to address.

Modeling the performance of multi-queue SSDs with an accurate storage abstraction
Optimizing LSM-tree-based storage engines for modern SSD hardware
Exploiting concurrent access to fully utilize the throughput of multi-queue SSDs

Innovation

Methods, ideas, or system contributions that make the work stand out.

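Introduces the Multi-Queue SSD (MQSSD) model, a new storage abstraction
Applies the model to optimize LSM-tree-based storage engines for modern SSDs
Leverages concurrent access to fully utilize multi-queue SSD throughput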