Decision Procedure for A Theory of String Sequences

📅 2025-08-31
🤖 AI Summary
Problem: SMT solvers that support the generic theory of sequences typically do not directly support operations that relate strings and sequences, such as matching a string against a regular expression, splitting a string, or joining the strings in a sequence, even though string-manipulating programs use these operations frequently. Method: The paper proposes a theory of string sequences and shows that, although its satisfiability problem is undecidable in general, decidability is recovered on the straight-line fragment. Each string sequence is encoded as a string and each string-sequence operation as a corresponding string operation; pre-image computation with respect to automata for these string operations casts the problem into the OSTRICH string constraint solving framework. Contribution/Results: The decision procedure is implemented as the tool $ostrichseq$. Experiments on benchmark constraints generated from real-world JavaScript programs, hand-crafted templates, and unit tests confirm the efficacy of the approach.

📝 Abstract
The theory of sequences, supported by many SMT solvers, can model program data types including bounded arrays and lists. Sequences are parameterized by the element data type and provide operations such as accessing elements, concatenation, forming sub-sequences and updating elements. Strings and sequences are intimately related; many operations, e.g., matching a string according to a regular expression, splitting strings, or joining strings in a sequence, are frequently used in string-manipulating programs. Nevertheless, these operations are typically not directly supported by existing SMT solvers, which instead only consider the generic theory of sequences. In this paper, we propose a theory of string sequences and study its satisfiability. We show that, while it is undecidable in general, the decidability can be recovered by restricting to the straight-line fragment. This is shown by encoding each string sequence as a string, and each string sequence operation as a corresponding string operation. We provide pre-image computation for the resulting string operations with respect to automata, effectively casting it into the generic OSTRICH string constraint solving framework. We implement the new decision procedure as a tool $ostrichseq$, and carry out experiments on benchmark constraints generated from real-world JavaScript programs, hand-crafted templates and unit tests. The experiments confirm the efficacy of our approach.
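
To make the idea of encoding a string sequence as a string more concrete, here is a minimal Python sketch. The abstract does not specify the concrete encoding, so everything below is an assumption made for illustration: the separator symbol `SEP`, the terminator-style layout, and the helper names `encode`, `decode`, `split_enc`, `join_enc`, and `concat_enc` are hypothetical and are not taken from the paper or from OSTRICH.

```python
# Hypothetical separator: a symbol assumed never to occur inside an element.
SEP = "\u241f"


def encode(seq):
    """Encode a list of strings as one string, terminating each element with SEP."""
    if any(SEP in s for s in seq):
        raise ValueError("elements must not contain the separator")
    return "".join(s + SEP for s in seq)


def decode(enc):
    """Invert encode: recover the list of strings from its string encoding."""
    if enc == "":
        return []
    if not enc.endswith(SEP):
        raise ValueError("not a valid encoding")
    return enc[:-1].split(SEP)


def split_enc(s, delim):
    """Encoded result of splitting s at a non-empty delimiter.

    Assumes neither s nor delim contains SEP. Splitting becomes a plain
    string operation: replace every delimiter, then terminate the last element.
    """
    return s.replace(delim, SEP) + SEP


def join_enc(enc, delim):
    """Join the elements of an encoded sequence with delim (a string operation)."""
    return enc[:-1].replace(SEP, delim) if enc else ""


def concat_enc(enc1, enc2):
    """Sequence concatenation is string concatenation on the encodings."""
    return enc1 + enc2
```

Terminating every element with the separator, rather than placing it only between elements, lets this sketch distinguish the empty sequence (encoded as the empty string) from the sequence containing one empty string (encoded as a single separator).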
Problem

Research questions and friction points this paper is trying to address.

Extending SMT solvers with direct support for string sequence operations such as regex matching, splitting, and joining
Establishing when satisfiability of string sequence constraints is decidable
Enabling automated reasoning about real-world string-manipulating programs
Innovation

Methods, ideas, or system contributions that make the work stand out.

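Each string sequence is encoded as a string, and each sequence operation as a corresponding string operation
Restricting to the straight-line fragment recovers decidability, although the general theory is undecidable
Pre-image computation with respect to automata casts the problem into the OSTRICH framework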
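
As an illustration of what pre-image computation with respect to an automaton means, the sketch below handles the simplest string operation, concatenation: given a DFA for the result `z = x + y`, it returns pairs of automata constraining `x` and `y`. This is the textbook construction (split the run at an intermediate state), shown here only as an assumption about the general flavor of the technique; the paper's pre-image constructions for the encoded sequence operations are not reproduced here, and the `DFA` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: frozenset
    init: int
    finals: frozenset
    delta: dict  # maps (state, character) to the successor state


def accepts(dfa, word):
    """Run the DFA on word; a missing transition means rejection."""
    q = dfa.init
    for c in word:
        q = dfa.delta.get((q, c))
        if q is None:
            return False
    return q in dfa.finals


def concat_preimage(dfa):
    """Pre-image of L(dfa) under concatenation, as a finite union of products.

    x + y is accepted by dfa exactly when, for some pair (A_x, A_y) in the
    result, x is accepted by A_x and y is accepted by A_y. Each pair
    corresponds to one intermediate state q reached after reading x.
    """
    return [
        (
            DFA(dfa.states, dfa.init, frozenset({q}), dfa.delta),  # x: init to q
            DFA(dfa.states, q, dfa.finals, dfa.delta),             # y: q to a final state
        )
        for q in sorted(dfa.states)
    ]


# Example automaton for the regular expression a*b+ over the alphabet {a, b}.
example = DFA(
    states=frozenset({0, 1}),
    init=0,
    finals=frozenset({1}),
    delta={(0, "a"): 0, (0, "b"): 1, (1, "b"): 1},
)
pairs = concat_preimage(example)
```

A solver built on this idea would branch over the pairs, replacing one regular constraint on the result of an operation by regular constraints on its arguments.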
Authors

Denghang Hu
Key Laboratory of System Software and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China
University of Chinese Academy of Sciences, China

Taolue Chen
School of Computing and Mathematical Sciences, Birkbeck, University of London
Software Engineering, Program Analysis and Verification, Machine Learning

Philipp Rümmer
University of Regensburg, Germany

Fu Song
Key Laboratory of System Software and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, China
University of Chinese Academy of Sciences, China
Nanjing Institute of Software Technology, China

Zhilin Wu
State Key Laboratory of Computer Science
Computational Logic, Program Analysis and Verification, Automata Theory