🤖 AI Summary
Existing model checkers struggle to combine sound safety verification with effective counterexample generation for programs that contain long, input-dependent error paths. This paper proposes the GPS algorithm, which formulates model checking as an abstraction-guided, directed search of the state space. Lightweight path abstractions (summaries) produced by compositional static analysis guide depth-first exploration, a two-layered search strategy coordinates verification and falsification, and a tailored instrumentation technique makes the search refutationally complete. GPS is presented as the first approach to unify efficient counterexample discovery with rigorous refutation completeness. Evaluated on the SV-COMP benchmark, GPS increases the number of solved instances by 12.7% and reduces average runtime by 34.5%, outperforming state-of-the-art tools.
📝 Abstract
In this work, we describe a new software model-checking algorithm called GPS. GPS treats the task of model checking a program as a directed search of the program states, guided by a compositional, summary-based static analysis. The summaries produced by static analysis are used both to prune away infeasible paths and to drive test generation to reach new, unexplored program states. GPS can find both proofs of safety and counterexamples to safety (i.e., inputs that trigger bugs), and features a novel two-layered search strategy that renders it particularly efficient at finding bugs in programs featuring long, input-dependent error paths. To make GPS refutationally complete (in the sense that it will find an error if one exists, provided it is allotted enough time), we introduce an instrumentation technique and show that it helps GPS achieve refutation completeness without sacrificing overall performance. We evaluated GPS on a suite of benchmarks that includes programs from both the Software Verification Competition (SV-COMP) and prior literature, and found that our implementation of GPS outperforms state-of-the-art software model checkers (including the top performers in the SV-COMP ReachSafety-Loops category), both in terms of the number of benchmarks solved and in terms of running time.
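The interplay the abstract describes, where a summary both prunes regions that cannot fail and steers test generation toward regions that might, can be illustrated with a toy sketch. This is not the GPS algorithm or its implementation. The program `run`, the summary `summary_may_fail`, and the driver `guided_search` are all hypothetical names invented for this illustration, and the "static analysis" is a hand-written over-approximation rather than a computed one.

```python
from typing import Optional, Tuple


def run(x: int, bound: int = 50) -> bool:
    """Toy program under test (hypothetical): returns True if the error is hit.

    The error needs a long, input-dependent path: the loop must run
    `bound` times, and the input must also be even.
    """
    i = 0
    while i < x:
        i += 1
        if i == bound and x % 2 == 0:
            return True  # error location reached
    return False


def summary_may_fail(lo: int, hi: int, bound: int = 50) -> bool:
    """Hand-written over-approximate summary (an assumption of this sketch).

    It ignores the parity condition and records only that the error
    requires x >= bound, so an input interval [lo, hi] can contain a
    failing input only if hi >= bound.
    """
    return hi >= bound


def guided_search(lo: int, hi: int, bound: int = 50) -> Tuple[str, Optional[int]]:
    """Search the input interval [lo, hi] for an error-triggering input."""
    worklist = [(lo, hi)]
    while worklist:
        a, b = worklist.pop()
        if not summary_may_fail(a, b, bound):
            continue  # pruned: the summary rules out every input in [a, b]
        # Summary-directed test: b is the input most likely to satisfy x >= bound.
        if run(b, bound):
            return ("unsafe", b)  # concrete counterexample
        if a < b:
            worklist.append((a, b - 1))  # b is now known safe; shrink the region
    return ("safe", None)  # every input was either pruned or tested


print(guided_search(0, 99))  # region contains failing inputs
print(guided_search(0, 49))  # summary alone proves safety, no test is run
```

In this sketch a "safe" verdict is justified because every input is either excluded by the over-approximate summary or executed concretely, while an "unsafe" verdict always comes with a concrete input. The real algorithm operates on program states and paths rather than on a single input interval.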