SwarmIO: Towards 100 Million IOPS SSD Emulation for Next-generation GPU-centric Storage Systems

📅 2026-04-08
🤖 AI Summary
Existing SSD emulators cannot efficiently handle GPU-initiated, highly concurrent I/O workloads, so they cannot accurately evaluate GPU-centric storage systems optimized for ultra-high random-read IOPS. This work presents SwarmIO, an SSD emulator that models GPU-initiated I/O and improves emulation efficiency and scalability through a highly scalable frontend architecture, lightweight GPU I/O path emulation, and efficient timing modeling. It models IOPS-optimized SSDs at target performance levels of up to 40 MIOPS and achieves a 303.9× speedup over the state-of-the-art baseline SSD emulator under GPU-initiated I/O. In a vector search case study, scaling SSD IOPS from 2.5 MIOPS to 40 MIOPS yields an average end-to-end speedup of up to 9.7×.
📝 Abstract
GPU-initiated I/O has emerged as a key mechanism for achieving high-throughput storage access by leveraging massive GPU thread-level parallelism, while recent industry trends point toward SSDs optimized for ultra-high random-read IOPS. Together, these trends are enabling the emergence of IOPS-optimized, GPU-centric storage systems. Despite this momentum, no existing framework enables quantitative end-to-end evaluation of storage systems optimized for GPU-initiated I/O. While conventional SSD emulators provide a promising path toward end-to-end modeling in traditional storage systems, they face three key challenges in this GPU-centric setting: limited frontend scalability for ingesting massive request streams, high software overhead in emulating GPU-initiated I/O control and data paths, and excessive timing-model maintenance overhead at extremely high I/O request rates. We propose SwarmIO, an SSD emulator for massively parallel, GPU-centric storage. SwarmIO faithfully models IOPS-optimized SSDs at target performance levels of up to 40 MIOPS, achieving a 303.9x speedup over the state-of-the-art baseline SSD emulator under GPU-initiated I/O. We further demonstrate its utility through a vector search case study, showing that increasing SSD IOPS from 2.5 MIOPS to 40 MIOPS yields an average end-to-end speedup of up to 9.7x.
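To make the scale of these targets concrete, the following back-of-the-envelope sketch works through the arithmetic implied by the abstract's numbers. It is not part of SwarmIO; the function names are illustrative, and the 50 µs device latency is a hypothetical value chosen only to demonstrate Little's law, not a figure from the paper.

```python
# Back-of-the-envelope arithmetic for the IOPS targets quoted in the abstract.
# Assumption: all helper names are illustrative, and ASSUMED_LATENCY_S is a
# hypothetical device latency that does not come from the paper.

ASSUMED_LATENCY_S = 50e-6  # hypothetical 50 microsecond read latency


def per_request_budget_ns(target_iops: float) -> float:
    """Average wall-clock time available per request at a given IOPS target."""
    return 1e9 / target_iops


def required_outstanding(target_iops: float, latency_s: float) -> float:
    """Little's law: mean in-flight requests = throughput * latency."""
    return target_iops * latency_s


def scaling_efficiency(speedup: float, iops_low: float, iops_high: float) -> float:
    """Fraction of the raw IOPS increase that shows up as end-to-end speedup."""
    return speedup / (iops_high / iops_low)


if __name__ == "__main__":
    low, high = 2.5e6, 40e6  # 2.5 MIOPS and 40 MIOPS, from the abstract
    for iops in (low, high):
        budget = per_request_budget_ns(iops)
        inflight = required_outstanding(iops, ASSUMED_LATENCY_S)
        print(f"{iops / 1e6:.1f} MIOPS: {budget:.0f} ns per request, "
              f"{inflight:.0f} requests in flight at the assumed latency")
    eff = scaling_efficiency(9.7, low, high)  # 9.7x is the reported speedup
    print(f"IOPS ratio {high / low:.0f}x, speedup 9.7x, efficiency {eff:.2f}")
```

At 40 MIOPS an emulator has on average only 25 ns of wall-clock time per request, which is why per-request software overhead in the frontend and timing model dominates. The 16× increase in device IOPS translating into a speedup of up to 9.7× indicates that the vector search workload benefits substantially from higher IOPS without scaling linearly with it.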
Problem

Research questions and friction points this paper is trying to address.

GPU-initiated I/O
SSD emulation
IOPS
storage systems
scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

SwarmIO
GPU-initiated I/O
SSD emulation
high IOPS
storage system simulation