Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries

📅 2023-07-28
🏛️ Conference on Innovative Data Systems Research
📈 Citations: 18
Influential: 1
🤖 AI Summary
To address the performance bottleneck that large inputs create in multi-way join queries, this paper proposes predicate transfer, a pre-filtering optimization built on Bloom filters. The method replaces computationally expensive semi-joins with lightweight Bloom filters and passes them between tables along the query’s join graph. Its key contribution is extending Bloom join, which pre-filters within a single join operation, to multi-table joins over arbitrary join graphs, including cyclic ones; Yannakakis’ semi-join algorithm, which inspired the approach, targets acyclic queries. Because a Bloom filter can report false positives but never false negatives, pre-filtering discards only rows that cannot contribute to the join, so query results are unchanged. On the TPC-H benchmark, predicate transfer outperforms Bloom join by 3.1× on average.
📝 Abstract
This paper presents predicate transfer, a novel method that optimizes join performance by pre-filtering tables to reduce the join input sizes. Predicate transfer generalizes Bloom join, which conducts pre-filtering within a single join operation, to multi-table joins such that the filtering benefits can be significantly increased. Predicate transfer is inspired by the seminal theoretical results of Yannakakis, which use semi-joins to pre-filter acyclic queries. Predicate transfer generalizes these results to arbitrary join graphs and uses Bloom filters in place of semi-joins, leading to significant speedup. Evaluation shows that predicate transfer outperforms Bloom join by 3.1x on average on the TPC-H benchmark.
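To make the starting point concrete, here is a minimal sketch of a Bloom join on a single join, the operation that predicate transfer generalizes. It is an illustration under stated assumptions, not the paper's implementation: the `BloomFilter` class, the `bloom_join` function, the filter size, and the sample tables are all hypothetical.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter (hypothetical): false positives possible, false negatives not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            data = repr((seed, key)).encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def bloom_join(build_rows, probe_rows, key):
    """Join two lists of dict rows on `key`, pre-filtering the probe side."""
    bloom = BloomFilter()
    hash_table = {}
    for row in build_rows:
        bloom.add(row[key])
        hash_table.setdefault(row[key], []).append(row)

    # Pre-filter: drop probe rows that certainly have no join partner.
    survivors = [row for row in probe_rows if bloom.might_contain(row[key])]

    # The join itself discards any false positives that slipped through.
    joined = [{**b, **p} for p in survivors for b in hash_table.get(p[key], [])]
    return joined, len(survivors)


# Assumed sample data: two customers remain after a local predicate.
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
orders = [{"order_id": i, "cust_id": i % 50} for i in range(200)]
joined, num_survivors = bloom_join(customers, orders, "cust_id")
print(len(joined), "joined rows from", num_survivors, "pre-filtered orders")
```

The filter only shrinks the probe input; correctness still comes from the join, which is why a false positive costs some extra work but never changes the result.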
Problem

Research questions and friction points this paper is trying to address.

Optimizes multi-join queries by pre-filtering tables before the join
Generalizes Bloom join to arbitrary join graphs
Uses Bloom filters in place of semi-joins
Innovation

Methods, ideas, or system contributions that make the work stand out.

Predicate transfer pre-filters tables to reduce join input sizes.
It generalizes Bloom join to multi-table joins for increased filtering benefits.
It replaces semi-joins with Bloom filters, yielding speedups on arbitrary join graphs.
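The ideas listed above can be sketched as a forward and a backward pass of Bloom filters over the join graph. This is a simplified illustration, not the paper's algorithm as published: the `BloomFilter` class, the `transfer_pass` and `predicate_transfer` functions, the edge format `(table_a, col_a, table_b, col_b)`, and the rule of orienting each edge by a caller-supplied table order are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter (hypothetical): false positives possible, false negatives not."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits, self.num_hashes, self.bits = num_bits, num_hashes, 0

    def _positions(self, key):
        for seed in range(self.num_hashes):
            data = repr((seed, key)).encode()
            digest = hashlib.blake2b(data, digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def transfer_pass(tables, edges, order):
    """Visit tables in `order`; each one filters itself, then sends filters onward."""
    rank = {name: i for i, name in enumerate(order)}
    incoming = {name: [] for name in order}  # (column, BloomFilter) pairs
    for name in order:
        # Apply every filter received from tables visited earlier.
        rows = [r for r in tables[name]
                if all(bf.might_contain(r[col]) for col, bf in incoming[name])]
        tables[name] = rows
        # Build filters on the surviving rows for neighbours visited later.
        for a, col_a, b, col_b in edges:
            for src, src_col, dst, dst_col in ((a, col_a, b, col_b), (b, col_b, a, col_a)):
                if src == name and rank[dst] > rank[name]:
                    bf = BloomFilter()
                    for r in rows:
                        bf.add(r[src_col])
                    incoming[dst].append((dst_col, bf))


def predicate_transfer(tables, edges, order):
    """Return pre-filtered copies of `tables`; the actual join would run afterwards."""
    reduced = {name: list(rows) for name, rows in tables.items()}
    transfer_pass(reduced, edges, order)        # forward pass
    transfer_pass(reduced, edges, order[::-1])  # backward pass
    return reduced


# Assumed sample data: a local predicate has already reduced `nation` to one row.
tables = {
    "nation": [{"n_key": 1}],
    "customer": [{"c_key": c, "n_key": c % 5} for c in range(20)],
    "orders": [{"o_key": o, "c_key": o % 20} for o in range(100)],
}
edges = [("nation", "n_key", "customer", "n_key"),
         ("customer", "c_key", "orders", "c_key")]
reduced = predicate_transfer(tables, edges, ["nation", "customer", "orders"])
print({name: len(rows) for name, rows in reduced.items()})
```

Orienting every edge by the visiting order turns any join graph, cyclic or not, into a directed acyclic one, so the same two passes apply unchanged. In this sketch the selective predicate on `nation` reaches `orders` even though the two tables never join directly.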