🤖 AI Summary
Modern SMT solvers lack native support for string-sequence operations—such as regex matching, splitting, and concatenation—limiting their applicability to real-world string-intensive programs.
Method: This paper introduces the first formalization of the string-sequence theory and identifies a decidable straight-line fragment thereof. It encodes string-sequence operations into standard string operations augmented with automata-based preimage computation, and integrates the approach into the OSTRICH framework for constraint solving.
Contribution/Results: We implement the resulting solver as $ostrichseq$, the first tool unifying the expressive power of sequence logic and string logic. Evaluated on realistic JavaScript program-generation benchmarks, $ostrichseq$ demonstrates both efficiency and practicality, significantly advancing the state of the art in solving string-sequence constraints.
📝 Abstract
The theory of sequences, supported by many SMT solvers, can model program data types including bounded arrays and lists. Sequences are parameterized by the element data type and provide operations such as accessing elements, concatenation, forming sub-sequences and updating elements. Strings and sequences are intimately related; many operations, e.g., matching a string according to a regular expression, splitting strings, or joining strings in a sequence, are frequently used in string-manipulating programs. Nevertheless, these operations are typically not directly supported by existing SMT solvers, which instead only consider the generic theory of sequences. In this paper, we propose a theory of string sequences and study its satisfiability. We show that, while it is undecidable in general, the decidability can be recovered by restricting to the straight-line fragment. This is shown by encoding each string sequence as a string, and each string sequence operation as a corresponding string operation. We provide pre-image computation for the resulting string operations with respect to automata, effectively casting it into the generic OSTRICH string constraint solving framework. We implement the new decision procedure as a tool $ostrichseq$, and carry out experiments on benchmark constraints generated from real-world JavaScript programs, hand-crafted templates and unit tests. The experiments confirm the efficacy of our approach.