Testing Storage-System Correctness: Challenges, Fuzzing Limitations, and AI-Augmented Opportunities

📅 2026-02-02

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Storage systems harbor subtle persistence, consistency, and crash-recovery bugs that are difficult to expose. Testing is hindered by nondeterministic interleavings, long-horizon state evolution, and correctness semantics that span multiple layers and execution phases. This survey organizes existing techniques by the execution properties and failure mechanisms they target, covering concurrency testing, long-running workloads, crash-consistency analysis, hardware-level semantic validation, and distributed fault injection. It highlights systematic mismatches between conventional fuzzing assumptions and storage-system semantics, and discusses how recent AI advances may complement fuzzing through state-aware and semantic guidance.

📝 Abstract

Storage systems are fundamental to modern computing infrastructures, yet ensuring their correctness remains challenging in practice. Despite decades of research on system testing, many storage-system failures (including durability, ordering, recovery, and consistency violations) remain difficult to expose systematically. This difficulty stems not primarily from insufficient testing tooling, but from intrinsic properties of storage-system execution, including nondeterministic interleavings, long-horizon state evolution, and correctness semantics that span multiple layers and execution phases. This survey adopts a storage-centric view of system testing and organizes existing techniques according to the execution properties and failure mechanisms they target. We review a broad spectrum of approaches, ranging from concurrency testing and long-running workloads to crash-consistency analysis, hardware-level semantic validation, and distributed fault injection, and analyze their fundamental strengths and limitations. Within this framework, we examine fuzzing as an automated testing paradigm, highlighting systematic mismatches between conventional fuzzing assumptions and storage-system semantics, and discuss how recent artificial intelligence advances may complement fuzzing through state-aware and semantic guidance. Overall, this survey provides a unified perspective on storage-system correctness testing and outlines key challenges
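
To make the idea of crash-consistency analysis mentioned in the abstract concrete, here is a minimal sketch of crash-point enumeration. It is not taken from the survey: the operation model (tuples of "write" and "fsync"), the function names, and the "commit implies data" invariant are all illustrative assumptions, and the reordering model (any subset of unflushed writes may reach disk) is deliberately simplified.

```python
from itertools import combinations


def crash_states(ops):
    """Yield every durable state reachable by crashing after any prefix of ops.

    ops is a list of ("write", key, value) or ("fsync",) tuples (a hypothetical
    model). Writes issued since the last fsync may reach the disk in any
    subset, which approximates reordering inside a volatile cache.
    """
    for cut in range(len(ops) + 1):
        durable, pending = {}, []
        for op in ops[:cut]:
            if op[0] == "fsync":
                durable.update(pending)  # all pending writes become durable
                pending = []
            else:
                pending.append((op[1], op[2]))
        # Crash here: any subset of the pending writes may have persisted.
        for r in range(len(pending) + 1):
            for subset in combinations(pending, r):
                state = dict(durable)
                state.update(subset)
                yield state


def invariant(state):
    # Assumed correctness rule: a commit flag must never exist without its data.
    return state.get("commit") != 1 or state.get("data") == "new"


def violations(ops):
    """Return the crash states that break the invariant."""
    return [s for s in crash_states(ops) if not invariant(s)]


# Correct protocol: flush the data before writing the commit flag.
safe = [("write", "data", "new"), ("fsync",), ("write", "commit", 1), ("fsync",)]
# Buggy protocol: the missing fsync lets the commit flag persist first.
buggy = [("write", "data", "new"), ("write", "commit", 1), ("fsync",)]

print(violations(safe))
print(violations(buggy))
```

The sketch also hints at why conventional fuzzing fits storage poorly: the bug in `buggy` is invisible to any single execution and appears only when the checker reasons about which intermediate states can survive a crash.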

Problem

Research questions and friction points this paper is trying to address.

storage-system correctness

durability

consistency

crash recovery

ordering violations

Innovation

Methods, ideas, or system contributions that make the work stand out.

AI-augmented testing

storage-system correctness

semantic-aware fuzzing