🤖 AI Summary
Crash-consistency testing must contend with a crash-state space that grows exponentially with the number of operations. Existing pruning techniques either sacrifice coverage (for example, by searching only for known buggy patterns or by bounding the exploration) or fail to scale (for example, by only skipping identical crash states). This paper introduces representative testing, a reduction strategy for application-level crash-consistency testing: it groups crash states that are not identical but whose consistency is correlated, based on the similarity of their update behaviors, and tests a small set of representatives, achieving both high coverage and high scalability. The approach is implemented in Pathfinder, a tool that integrates behavioral modeling, heuristic state reduction, and crash injection aware of POSIX and MMIO semantics. Evaluated on eight production-ready systems, Pathfinder finds 18 crash-consistency bugs, 7 of them previously unknown, and finds 4× more bugs in POSIX-based applications and 8× more bugs in MMIO-based applications than state-of-the-art tools.
📝 Abstract
Crash consistency is essential for applications that must persist data, and crash-consistency testing is commonly applied to find crash-consistency bugs in such applications. Because the crash-state space grows exponentially with the number of operations in the program, techniques for pruning the search space are necessary. However, state-of-the-art crash-state space pruning is far from ideal. Some techniques look for known buggy patterns or bound the exploration for efficiency, but they sacrifice coverage and may miss bugs lodged deep within applications. Other techniques eliminate redundancy in the search space by skipping identical crash states, but they still fail to scale to larger applications. In this work, we propose representative testing: a new crash-state space reduction strategy that achieves both high scalability and high coverage. Our key observation is that the consistency of crash states is often correlated, even if those crash states are not identical. We build Pathfinder, a crash-consistency testing tool that implements an update-behavior-based heuristic to approximate a small set of representative crash states. We evaluate Pathfinder on POSIX-based and MMIO-based applications, where it finds 18 bugs (7 new) across 8 production-ready systems. Pathfinder scales more effectively to large applications than prior work and finds 4× more bugs in POSIX-based applications and 8× more bugs in MMIO-based applications than state-of-the-art systems.
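To make the scaling problem and the idea of representatives concrete, the following toy sketch enumerates the crash states of a short write trace and then keeps one state per group. It is an illustration only, not Pathfinder's algorithm: the trace, the labels `append_record` and `update_header`, the function names, and the grouping key (the set of code patterns whose writes persisted) are all assumptions made for this example, and the sketch assumes writes may persist in any combination.

```python
from itertools import combinations

# Hypothetical trace: (write id, label of the code pattern that issued it).
TRACE = [
    (0, "append_record"),
    (1, "append_record"),
    (2, "append_record"),
    (3, "update_header"),
]


def crash_states(trace):
    """Enumerate every subset of writes that might be persisted at a crash.

    Assumes no ordering constraints between writes, so n writes
    yield 2**n candidate crash states.
    """
    return [
        frozenset(subset)
        for k in range(len(trace) + 1)
        for subset in combinations(trace, k)
    ]


def behavior_signature(state):
    """Toy similarity key (an assumption): which code patterns persisted."""
    return frozenset(label for _, label in state)


def pick_representatives(states):
    """Keep the first crash state seen for each signature."""
    reps = {}
    for state in states:
        reps.setdefault(behavior_signature(state), state)
    return reps


states = crash_states(TRACE)
reps = pick_representatives(states)
print(len(states), "crash states,", len(reps), "representatives")
```

With four writes the full space has 16 crash states, while the toy key leaves 4 groups to test. Each additional `append_record` write doubles the full space but, under this key, adds no new group, which is the kind of gap a representative-based strategy aims to exploit.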