Output-Sensitive Evaluation of Regular Path Queries

📅 2024-12-10

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the efficient evaluation of Regular Path Queries (RPQs) in graph databases, targeting the redundant computation of the traditional Product Graph (PG) method, especially when the output is small. It proposes OSPG, an output-sensitive refinement of PG that integrates product automaton construction with enhanced bidirectional reachability search, augmented by output-size-aware pruning and sparse-graph adaptation. OSPG achieves data complexity O(|E|^{3/2} + min(OUT·√|E|, |V|·|E|)) for general RPQs, where OUT is the number of distinct vertex pairs in the output, and O(|E| + |E|·√OUT) for queries without Kleene star. This is never worse than PG's O(|V|·|E|) and is asymptotically smaller for small outputs on sparse graphs. Experimental evaluation on real-world graphs confirms substantial reductions in both runtime and memory access overhead.

📝 Abstract

We study the classical evaluation problem for regular path queries: given an edge-labeled graph and a regular path query, compute the set of pairs of vertices that are connected by paths that match the query. The Product Graph (PG) is the established evaluation approach for regular path queries. PG first constructs the product automaton of the data graph and the query and then uses breadth-first search to find the accepting states reachable from each initial state in the product automaton. Its data complexity is O(|V|·|E|), where V and E are the sets of vertices and edges, respectively, in the data graph. This complexity cannot be improved by combinatorial algorithms. In this paper, we introduce OSPG, an output-sensitive refinement of PG, whose data complexity is O(|E|^{3/2} + min(OUT·√|E|, |V|·|E|)), where OUT is the number of distinct vertex pairs in the query output. OSPG's complexity is at most that of PG and can be asymptotically smaller for small output and sparse input. OSPG improves on PG by avoiding the time that PG wastes in the breadth-first search phase when only a few output pairs are eventually discovered. For queries without Kleene star, the complexity of OSPG can be further improved to O(|E| + |E|·√OUT).
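
The PG baseline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's code and not OSPG: it assumes the query has already been compiled into an NFA without epsilon transitions, and the names `evaluate_rpq_pg`, `nfa_delta`, `nfa_start`, and `nfa_accepting` are hypothetical. The product automaton is explored implicitly, with one breadth-first search per start vertex.

```python
from collections import defaultdict, deque


def evaluate_rpq_pg(edges, nfa_delta, nfa_start, nfa_accepting):
    """Baseline PG evaluation: BFS over the implicit product of graph and NFA.

    edges: iterable of (source, label, target) triples of the data graph.
    nfa_delta: dict mapping (state, label) -> set of successor states
               (assumed epsilon-free; hypothetical encoding).
    Returns the set of (u, v) pairs joined by a path whose label word
    is accepted by the NFA.
    """
    adjacency = defaultdict(list)
    vertices = set()
    for source, label, target in edges:
        adjacency[source].append((label, target))
        vertices.update((source, target))

    answers = set()
    for start_vertex in vertices:
        # Product nodes are (graph vertex, NFA state) pairs.
        seen = {(start_vertex, nfa_start)}
        queue = deque(seen)
        while queue:
            vertex, state = queue.popleft()
            if state in nfa_accepting:
                answers.add((start_vertex, vertex))
            for label, next_vertex in adjacency[vertex]:
                for next_state in nfa_delta.get((state, label), ()):
                    node = (next_vertex, next_state)
                    if node not in seen:
                        seen.add(node)
                        queue.append(node)
    return answers


# Example: the query a b* as an NFA with states 0 (start) and 1 (accepting).
example_edges = [(1, "a", 2), (2, "b", 3), (3, "b", 3), (3, "a", 4)]
example_delta = {(0, "a"): {1}, (1, "b"): {1}}
print(sorted(evaluate_rpq_pg(example_edges, example_delta, 0, {1})))
# prints [(1, 2), (1, 3), (3, 4)]
```

Each start vertex triggers a full search of the product even when it contributes no output pair, which is the wasted work in the breadth-first search phase that the abstract attributes to PG.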

Problem

Research questions and friction points this paper is trying to address.

Efficiently compute regular path query results on graphs

Reduce time complexity for sparse graphs and small outputs

Optimize breadth-first search in product automaton evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Output-sensitive refinement of Product Graph

Improved complexity O(|E|^{3/2} + min(OUT·√|E|, |V|·|E|))

Further optimization for non-Kleene-star queries
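
A short worked comparison may help in reading the bounds listed above. It assumes a sparse data graph with |E| = Θ(|V|) and a fixed query; the two output regimes are illustrative choices, not cases taken from the paper.

```latex
\begin{align*}
\text{PG:}\quad
  & |V|\cdot|E| = \Theta(|V|^{2}) \\
\text{OSPG, } \mathrm{OUT} \le |V|:\quad
  & |E|^{3/2} + \min\bigl(\mathrm{OUT}\cdot\sqrt{|E|},\ |V|\cdot|E|\bigr)
    = \Theta(|V|^{3/2}) \\
\text{OSPG, } \mathrm{OUT} = \Theta(|V|^{2}):\quad
  & |E|^{3/2} + \min\bigl(\Theta(|V|^{5/2}),\ \Theta(|V|^{2})\bigr)
    = \Theta(|V|^{2})
\end{align*}
```

In the first regime the OSPG bound is smaller than PG's by a factor of √|V|; in the second, the min term caps the bound at PG's cost, consistent with the claim that OSPG's complexity is at most that of PG.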

🔎 Similar Papers

PathFinder: A unified approach for handling paths in graph query languages