🤖 AI Summary
This work addresses the problem of bounding the size of intermediate results, and the asymptotic evaluation complexity, of recursive Datalog queries. The authors propose EDB-bounded Datalog, a framework in which each rule is annotated with a non-recursive conjunctive query that subsumes the rule's result and therefore acts as an upper bound. From these annotations they define a width measure based on (integral or fractional) edge-cover width, which yields upper bounds on the size of every IDB predicate that are polynomial in the input data size for a fixed program structure. The size bounds in turn give fixed-parameter tractable, output-sensitive complexity bounds for evaluating the whole program. An adaptation of the framework provides a semi-decision procedure for Datalog boundedness that efficiently rewrites most practical bounded programs into equivalent non-recursive programs.
📝 Abstract
We introduce EDB-bounded datalog, a framework for deriving upper bounds on intermediate result sizes and on the asymptotic complexity of recursive queries in datalog. We present an algorithm that, given an arbitrary datalog program, constructs an EDB-bounded datalog program in which every rule is adorned with a (non-recursive) conjunctive query that subsumes the result of the rule, thus acting as an upper bound. From such adornments, we define a notion of width based on (integral or fractional) edge-cover widths. Through the adornments and the width measure, we obtain, for every IDB predicate, a worst-case upper bound on its size that is polynomial in the input data size, given a fixed program structure. Furthermore, with these size bounds, we also derive fixed-parameter tractable, output-sensitive asymptotic complexity bounds for evaluating the entire program. Additionally, by adapting our framework, we obtain a semi-decision procedure for datalog boundedness that efficiently rewrites most practical bounded programs into equivalent non-recursive programs.
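To illustrate how an adornment could translate into a size bound, the following is a minimal sketch and not the paper's algorithm. It assumes that the integral width of an adornment is the smallest number of its atoms that together cover the head variables, and it relies on the standard fact that a projection covered by k atoms, each over a relation of at most N tuples, has at most N^k tuples. The function names and the transitive-closure adornment are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def integral_edge_cover_number(variables, atoms):
    """Smallest number of atoms whose variables together cover `variables`.

    `atoms` is a list of (relation_name, tuple_of_variables) pairs.
    Brute force over subsets; fine for the handful of atoms in one rule.
    """
    target = set(variables)
    for k in range(len(atoms) + 1):
        for subset in combinations(atoms, k):
            covered = set()
            for _name, atom_vars in subset:
                covered.update(atom_vars)
            if target <= covered:
                return k
    raise ValueError("some head variable does not occur in any atom")


def size_bound(n, width):
    """Upper bound N^width on the number of tuples, for relations of size <= n."""
    return n ** width


# Hypothetical adornment for transitive closure T over an EDB relation E:
# every derived pair (x, y) has x as the source of some edge and y as the
# target of some edge, so T(x, y) is subsumed by the conjunctive query
# E(x, u), E(v, y).
adornment = [("E", ("x", "u")), ("E", ("v", "y"))]
width = integral_edge_cover_number(("x", "y"), adornment)
bound = size_bound(10, width)  # bound for an edge relation with 10 tuples
```

Under these assumptions the sketch assigns the transitive-closure predicate width 2, that is, a quadratic bound in the number of edges. The fractional variant would replace the subset search with a linear program over atom weights and can give tighter exponents.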