🤖 AI Summary
FC-Datalog, a recursive query language on strings, captures P. This makes it a tractable form of Datalog on strings, but it leaves open which fragments admit lower, more tightly controlled model-checking complexity.
Method: We start from the recently introduced FC-Datalog, which extends FC (a logic on strings built around word equations, optionally with regular constraints) with recursion and can define recursive relations for core spanners. We propose a series of FC-Datalog fragments chosen with three criteria in mind: complexity of model checking, expressive power, and efficiency of checking membership in the fragment. These fragments all capture LOGSPACE, and we restrict them further to obtain linear combined complexity.
Contribution/Results: We obtain a range of FC-Datalog fragments that capture LOGSPACE, together with further restricted fragments that have linear combined complexity. The result is a framework for tailoring fragments to particular applications, which we showcase by simulating deterministic regex in a tailored fragment of FC-Datalog.
📝 Abstract
Core spanners are a class of document spanners that capture the core functionality of IBM's AQL. FC is a logic on strings built around word equations that, when extended with constraints for regular languages, can be seen as a logic for core spanners. The recently introduced FC-Datalog extends FC with recursion, which allows us to define recursive relations for core spanners. Additionally, as FC-Datalog captures P, it is also a tractable version of Datalog on strings. This presents an opportunity for optimization. We propose a series of FC-Datalog fragments with desirable properties in terms of complexity of model checking, expressive power, and efficiency of checking membership in the fragment. This leads to a range of fragments that all capture LOGSPACE, which we further restrict to obtain linear combined complexity. This gives us a framework to tailor fragments for particular applications. To showcase this, we simulate deterministic regex in a tailored fragment of FC-Datalog.
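To make the idea of recursion over word equations concrete, the following is a minimal sketch, not taken from the paper. It assumes the usual FC convention that variables range over factors (contiguous substrings) of the input word, and it evaluates one illustrative recursive relation by naive fixed-point iteration. The function names `factors`, `balanced_factors`, and `accepts`, the rule notation in the comments, and the chosen example language are all hypothetical. The naive evaluation shown here is only meant to illustrate the semantics and says nothing about the LOGSPACE or linear-time bounds of the fragments described above.

```python
def factors(w):
    """Return all factors (contiguous substrings) of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def balanced_factors(w):
    """Least fixed point of a hypothetical recursive relation R over the factors of w.

    Illustrative rules (notation is an assumption, not the paper's syntax):
        R(x) <- x = ""
        R(x) <- x = "a" . y . "b", R(y)
    """
    universe = factors(w)   # variables may only take factors of w as values
    R = {""}                # base rule: the empty word is in R
    changed = True
    while changed:          # iterate until no new fact can be derived
        changed = False
        for y in list(R):
            x = "a" + y + "b"   # word equation x = a . y . b
            if x in universe and x not in R:
                R.add(x)
                changed = True
    return R


def accepts(w):
    """The input word is accepted if the whole word is in R."""
    return w in balanced_factors(w)
```

For example, `accepts("aabb")` holds because `""`, `"ab"`, and `"aabb"` are derived in turn, while `accepts("aab")` does not hold because no rule derives `"aab"`.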