Parachute: Single-Pass Bi-Directional Information Passing

📅 2025-06-16
🤖 AI Summary
Production query engines implement sideways information passing as a uni-directional flow between operators, while instance-optimal algorithms that propagate information in both directions, such as Yannakakis', require an additional pass over the input, which hinders their adoption. This paper takes a step toward bi-directional information passing within a single pass. It statically analyzes the query plan to determine between which tables the information flow is blocked, then uses precomputed join-induced fingerprint columns on foreign-key (FK) tables to carry pruning information across those blocked directions without a second pass. The approach trades space for time: given a budget of 15% extra space, Parachute improves DuckDB v1.2's end-to-end execution time on the Join Order Benchmark (JOB) by 1.54× without semi-join filtering and by 1.24× with it.

📝 Abstract
Sideways information passing is a well-known technique for mitigating the impact of large build sides in a database query plan. As currently implemented in production systems, sideways information passing enables only a uni-directional information flow, as opposed to instance-optimal algorithms, such as Yannakakis'. On the other hand, the latter require an additional pass over the input, which hinders adoption in production systems. In this paper, we make a step towards enabling single-pass bi-directional information passing during query execution. We achieve this by statically analyzing between which tables the information flow is blocked and by leveraging precomputed join-induced fingerprint columns on FK-tables. On the JOB benchmark, Parachute improves DuckDB v1.2's end-to-end execution time without and with semi-join filtering by 1.54x and 1.24x, respectively, when allowed to use 15% extra space.
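
The contrast drawn in the abstract can be illustrated with a toy example. The sketch below is not Parachute's code; the table names, the `one_way_sip` and `two_pass_reduction` helpers, and the use of a Python set in place of a Bloom filter are all assumptions made for illustration. It shows that a build-to-probe filter prunes only the probe side, while a semi-join reduction in the style of Yannakakis' algorithm prunes both sides but has to read the probe input twice.

```python
def one_way_sip(build, probe, key):
    """Uni-directional passing: build-side keys prune the probe scan only."""
    build_keys = {row[key] for row in build}  # a set stands in for a Bloom filter
    pruned_probe = [row for row in probe if row[key] in build_keys]
    return build, pruned_probe  # the build side is returned unpruned


def two_pass_reduction(build, probe, key):
    """Semi-join reduction of one join: both sides shrink, probe is read twice."""
    probe_keys = {row[key] for row in probe}  # first pass over probe
    reduced_build = [row for row in build if row[key] in probe_keys]
    build_keys = {row[key] for row in reduced_build}
    reduced_probe = [row for row in probe if row[key] in build_keys]  # second pass
    return reduced_build, reduced_probe


# Hypothetical data: six titles on the build side, a filtered cast table on the probe side.
titles = [{"movie_id": i, "year": 1990 + i} for i in range(6)]
cast = [
    {"movie_id": 0, "role": "actor"},
    {"movie_id": 1, "role": "director"},
    {"movie_id": 1, "role": "actor"},
    {"movie_id": 7, "role": "director"},
]
directors = [row for row in cast if row["role"] == "director"]

sip_build, sip_probe = one_way_sip(titles, directors, "movie_id")
red_build, red_probe = two_pass_reduction(titles, directors, "movie_id")
print(len(sip_build), len(sip_probe))
print(len(red_build), len(red_probe))
```

In the one-way variant all six titles still enter the hash table even though only one of them can join; the reduction removes them, at the price of the extra pass that the abstract identifies as the obstacle to adoption.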

Problem

Research questions and friction points this paper is trying to address.

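Production systems implement sideways information passing only as a uni-directional information flow
Instance-optimal algorithms such as Yannakakis' need an additional pass over the input, which hinders adoption
Can bi-directional information passing be achieved in a single pass during query execution?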

Innovation

Methods, ideas, or system contributions that make the work stand out.

Single-pass bi-directional information passing
Static analysis of blocked information flow
Precomputed join-induced fingerprint columns
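
The summary does not specify how the fingerprint columns are encoded, so the following is only one plausible reading, sketched under stated assumptions: each FK row stores a small hash of an attribute of the PK row it references, and an equality predicate on that attribute is then checked during the FK scan itself. The helper names, the 16-bit width, the CRC32 hash, and the sample tables are all hypothetical and not taken from the paper.

```python
import zlib

FP_BITS = 16  # fingerprint width; an arbitrary choice for this sketch


def fingerprint(value):
    """Map an attribute value to a small integer fingerprint."""
    return zlib.crc32(value.encode("utf-8")) % (1 << FP_BITS)


def add_fingerprint_column(fk_rows, pk_rows, key, attr, fp_col):
    """Offline step: store on each FK row the fingerprint of the joined PK attribute."""
    attr_by_key = {row[key]: row[attr] for row in pk_rows}
    return [{**row, fp_col: fingerprint(attr_by_key[row[key]])} for row in fk_rows]


def scan_with_fingerprint(fk_rows, fp_col, wanted_value):
    """Query time: apply the PK-side equality predicate while scanning the FK table.

    Hash collisions can let extra rows through, so the result is a superset
    of the true matches and the join itself must still check the predicate.
    """
    target = fingerprint(wanted_value)
    return [row for row in fk_rows if row[fp_col] == target]


# Hypothetical PK table (kind_type) and FK table (title).
kind_type = [
    {"kind_id": 1, "kind": "movie"},
    {"kind_id": 2, "kind": "tv series"},
    {"kind_id": 3, "kind": "video game"},
]
title = [
    {"title_id": 10, "kind_id": 1},
    {"title_id": 11, "kind_id": 2},
    {"title_id": 12, "kind_id": 1},
    {"title_id": 13, "kind_id": 3},
]

title_fp = add_fingerprint_column(title, kind_type, "kind_id", "kind", "kind_fp")
candidates = scan_with_fingerprint(title_fp, "kind_fp", "movie")
print([row["title_id"] for row in candidates])
```

Under this reading, the extra column is where the additional space goes, and the FK scan can be pruned by a PK-side predicate without first reading the PK table. Because a fingerprint match is only approximate, it never drops a qualifying row but may keep a non-qualifying one.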