🤖 AI Summary
Graph query languages such as Cypher, SQL/PGQ, and GQL lack a unified, efficient mechanism for evaluating path queries. The difficulty is greatest when a query combines a path mode (for example shortest, simple, or trail) with a regular-expression constraint on edge labels, which strains both expressive power and performance. This paper introduces PathFinder, a general-purpose framework for answering path queries across these languages. It builds on a compact representation of the potentially exponential set of matching paths and combines dynamic-programming-based enumeration, incremental pipelined execution, and regex compilation optimizations to evaluate the commonly used path modes and edge-label constraints in a uniform way. Experiments on real-world datasets and queries show performance an order of magnitude better than that of many modern graph engines, with stable behavior even on large data and complex queries.
📝 Abstract
Path queries are a core feature of modern graph query languages such as Cypher, SQL/PGQ, and GQL. These languages provide a rich set of features for matching paths, such as restricting to certain path modes (shortest, simple, trail) and constraining the edge labels along the path by a regular expression. In this paper we present PathFinder, a unifying approach for dealing with path queries in all these query languages. PathFinder leverages a compact representation of the (potentially exponential number of) paths that can match a given query, extends it with pipelined execution, and supports all commonly used path modes. In the paper we describe the algorithmic backbone of PathFinder, provide a reference implementation, and test it over a large set of real-world queries and datasets. Our results show that PathFinder exhibits very stable behavior, even on large data and complex queries, and its performance is an order of magnitude better than that of many modern graph engines.
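To make the idea of a compact path representation concrete, here is a minimal Python sketch of one standard technique: breadth-first search over the product of the graph and an automaton for the edge-label regex. It is an illustration under stated assumptions, not PathFinder's actual implementation. The function names, the edge-list format, and the hand-written DFA for the regex `knows+` are all hypothetical. The sketch assumes a deterministic automaton, so each matching path has exactly one run and is reported once.

```python
from collections import defaultdict, deque


def build_shortest_path_dag(edges, dfa, start_state, source):
    """BFS over (node, automaton state) pairs.

    edges: iterable of (u, label, v) triples (hypothetical input format).
    dfa:   dict mapping (state, label) -> next state.
    Returns (dist, preds); preds keeps, for each reached pair, every
    predecessor that lies on some shortest path, so all shortest paths
    are stored in space linear in the size of the product graph.
    """
    adj = defaultdict(list)
    for u, label, v in edges:
        adj[u].append((label, v))

    start = (source, start_state)
    dist = {start: 0}
    preds = {start: []}
    queue = deque([start])
    while queue:
        node, state = queue.popleft()
        for label, nxt in adj[node]:
            nxt_state = dfa.get((state, label))
            if nxt_state is None:  # label not allowed by the regex here
                continue
            key = (nxt, nxt_state)
            if key not in dist:  # first visit: shortest distance is fixed
                dist[key] = dist[(node, state)] + 1
                preds[key] = [((node, state), label)]
                queue.append(key)
            elif dist[key] == dist[(node, state)] + 1:  # another shortest way in
                preds[key].append(((node, state), label))
    return dist, preds


def enumerate_paths(preds, key):
    """Lazily yield each stored path as [n0, label1, n1, ...]."""
    if not preds[key]:  # reached the start pair
        yield [key[0]]
        return
    for prev, label in preds[key]:
        for prefix in enumerate_paths(preds, prev):
            yield prefix + [label, key[0]]


def all_shortest(edges, dfa, start_state, final_states, source, target):
    """Yield all shortest source-to-target paths whose labels match the DFA."""
    dist, preds = build_shortest_path_dag(edges, dfa, start_state, source)
    ends = [(target, q) for q in final_states if (target, q) in dist]
    if not ends:
        return
    best = min(dist[k] for k in ends)
    for k in ends:
        if dist[k] == best:
            yield from enumerate_paths(preds, k)


# Toy data: two equally short "knows" routes from a to d.
EDGES = [
    ("a", "knows", "b"), ("a", "knows", "c"),
    ("b", "knows", "d"), ("c", "knows", "d"),
    ("d", "works", "e"),
]
# Hand-built DFA for the regex knows+ (state 1 is accepting).
DFA = {(0, "knows"): 1, (1, "knows"): 1}

for path in all_shortest(EDGES, DFA, 0, {1}, "a", "d"):
    print(path)
```

Because `all_shortest` is a generator, a consumer can stop after the first few results without materializing the rest, which loosely mirrors the pipelined execution the abstract mentions. The simple and trail modes need extra bookkeeping to reject repeated nodes or edges and are not covered by this sketch.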