Compiling Set Queries into Work-Efficient Tree Traversals

📅 2025-11-19
🤖 AI Summary
Problem: Existing systems require pruning logic to be implemented by hand for each query predicate and data structure, which makes complex set queries hard to support. Method: The paper proposes a compiler that derives, from the metadata stored at each tree node, the conditions under which a subtree can be pruned or included wholesale. It generates these conditions with symbolic interval analysis, extended with new rules for geometric predicates (e.g., intersection, containment), and fuses compound queries such as reductions over filters into a single tree traversal. Together these techniques yield generalized single-index and dual-index tree joins that support join predicates beyond standard equality and range predicates. Contribution/Results: The generated traversals match the behavior of expert-written, query-specific code, and can asymptotically outperform the linear scans and nested-loop joins that existing systems fall back to when no hand-written case applies.

📝 Abstract
Trees can accelerate queries that search or aggregate values over large collections. They achieve this by storing metadata that enables quick pruning (or inclusion) of subtrees when predicates on that metadata can prove that none (or all) of the data in a subtree affect the query result. Existing systems implement this pruning logic manually for each query predicate and data structure. We generalize and mechanize this class of optimization. Our method derives conditions for when subtrees can be pruned (or included wholesale), expressed in terms of the metadata available at each node. We efficiently generate these conditions using symbolic interval analysis, extended with new rules to handle geometric predicates (e.g., intersection, containment). Additionally, our compiler fuses compound queries (e.g., reductions on filters) into a single tree traversal. These techniques enable the automatic derivation of generalized single-index and dual-index tree joins that support a wide class of join predicates beyond standard equality and range predicates. The generated traversals match the behavior of expert-written code that implements query-specific traversals, and can asymptotically outperform the linear scans and nested-loop joins that existing systems fall back to when hand-written cases do not apply.
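
To make the pruning and wholesale-inclusion idea concrete, here is a minimal hand-written sketch of the kind of traversal the abstract describes: a filter (a <= x <= b) fused with a reduction (sum) over a tree whose nodes store min, max, and sum metadata. The names Node, build, and range_sum are hypothetical and chosen for illustration only; this is not the paper's compiler output.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    lo: float                      # metadata: minimum value in this subtree
    hi: float                      # metadata: maximum value in this subtree
    total: float                   # metadata: sum of all values in this subtree
    values: List[float]            # values stored directly; non-empty only at leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(sorted_values: List[float], leaf_size: int = 4) -> Node:
    """Build a balanced tree over an already-sorted, non-empty list."""
    if len(sorted_values) <= leaf_size:
        return Node(sorted_values[0], sorted_values[-1],
                    sum(sorted_values), list(sorted_values))
    mid = len(sorted_values) // 2
    left = build(sorted_values[:mid], leaf_size)
    right = build(sorted_values[mid:], leaf_size)
    return Node(left.lo, right.hi, left.total + right.total, [], left, right)


def range_sum(node: Node, a: float, b: float) -> float:
    """Sum of all x in the subtree with a <= x <= b, in one fused traversal."""
    if node.hi < a or node.lo > b:
        return 0                   # prune: no value here can pass the filter
    if a <= node.lo and node.hi <= b:
        return node.total          # include wholesale: every value passes
    if node.left is None:          # leaf with mixed membership: test each value
        return sum(x for x in node.values if a <= x <= b)
    return range_sum(node.left, a, b) + range_sum(node.right, a, b)
```

Calling range_sum(build(sorted(xs)), a, b) returns the same result as sum(x for x in xs if a <= x <= b), but it only descends into subtrees whose [lo, hi] range straddles a query boundary.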
Problem

Research questions and friction points this paper is trying to address.

Automating tree traversal optimizations for efficient query execution
Generalizing pruning conditions using symbolic interval analysis techniques
Enabling compound query fusion and multi-index joins beyond standard predicates
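
As a rough illustration of how interval analysis can turn a predicate into pruning and inclusion conditions, the sketch below evaluates the example predicate 2*x + 1 <= 10 over the interval [node_min, node_max] instead of a single value. The Interval class, the three-valued le comparison, and the classify helper are assumptions made for this example; the paper's analysis is symbolic and covers more operators than shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # Sum of two ranges: add the lower bounds and the upper bounds.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, k: float) -> "Interval":
        # Multiplying by a negative constant swaps the bounds.
        lo, hi = k * self.lo, k * self.hi
        return Interval(min(lo, hi), max(lo, hi))


def const(c: float) -> Interval:
    return Interval(c, c)


def le(x: Interval, y: Interval) -> Optional[bool]:
    """Three-valued x <= y: True if it holds for every pair of values,
    False if it holds for none, None if the intervals cannot decide."""
    if x.hi <= y.lo:
        return True
    if x.lo > y.hi:
        return False
    return None


def classify(node_min: float, node_max: float) -> str:
    """Decide what to do with a subtree for the predicate 2*x + 1 <= 10."""
    x = Interval(node_min, node_max)
    verdict = le(x.scale(2) + const(1), const(10))
    if verdict is True:
        return "include"   # every value in the subtree satisfies the predicate
    if verdict is False:
        return "prune"     # no value in the subtree satisfies the predicate
    return "descend"       # metadata is inconclusive; visit the children
```

For example, a subtree with values in [0, 4] is included wholesale, one with values in [5, 9] is pruned, and one with values in [3, 6] must be descended into.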
Innovation

Methods, ideas, or system contributions that make the work stand out.

Automated subtree pruning via symbolic interval analysis
Compiler fuses compound queries into single traversal
Generalized tree joins extend beyond standard predicates
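
A dual-index tree join with a geometric predicate can be sketched as follows. The example counts pairs of closed 1D segments, one from each tree, that intersect; each node stores the minimum and maximum of its segments' lower and upper endpoints. All names (Segment, Node, build, count_join) and the choice of metadata are assumptions for illustration, not the paper's generated code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Segment = Tuple[float, float]      # closed interval (lo, hi) with lo <= hi


@dataclass
class Node:
    min_lo: float                  # smallest lower endpoint in the subtree
    max_lo: float                  # largest lower endpoint in the subtree
    min_hi: float                  # smallest upper endpoint in the subtree
    max_hi: float                  # largest upper endpoint in the subtree
    count: int                     # number of segments in the subtree
    items: List[Segment]           # segments stored directly; only at leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(segs: List[Segment], leaf_size: int = 2) -> Node:
    """Build a tree over a non-empty list of segments, split by lower endpoint."""
    segs = sorted(segs)
    if len(segs) <= leaf_size:
        los = [s[0] for s in segs]
        his = [s[1] for s in segs]
        return Node(min(los), max(los), min(his), max(his), len(segs), segs)
    mid = len(segs) // 2
    l, r = build(segs[:mid], leaf_size), build(segs[mid:], leaf_size)
    return Node(min(l.min_lo, r.min_lo), max(l.max_lo, r.max_lo),
                min(l.min_hi, r.min_hi), max(l.max_hi, r.max_hi),
                l.count + r.count, [], l, r)


def intersects(x: Segment, y: Segment) -> bool:
    return x[0] <= y[1] and y[0] <= x[1]


def count_join(a: Node, b: Node) -> int:
    """Count pairs (x, y), x under a and y under b, with intersects(x, y)."""
    # Prune: one conjunct of the predicate is false for every pair.
    if a.min_lo > b.max_hi or b.min_lo > a.max_hi:
        return 0
    # Include wholesale: both conjuncts are true for every pair.
    if a.max_lo <= b.min_hi and b.max_lo <= a.min_hi:
        return a.count * b.count
    if a.left is None and b.left is None:
        return sum(intersects(x, y) for x in a.items for y in b.items)
    # Otherwise split the larger side that still has children.
    if b.left is None or (a.left is not None and a.count >= b.count):
        return count_join(a.left, b) + count_join(a.right, b)
    return count_join(a, b.left) + count_join(a, b.right)
```

The prune and include tests are the intersection predicate evaluated on the worst-case and best-case endpoints of each subtree, which is the same shape of condition that interval analysis over the node metadata would produce.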