Including Bloom Filters in Bottom-up Optimization

📅 2025-05-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses two challenges in bottom-up, cost-based query optimization: integrating Bloom filters into plan selection, and containing the search-space growth that this integration causes. It shows how a bottom-up optimizer can account for Bloom filters during join-order selection and predicate transfer between operators, for all query types. Earlier Bloom filter-aware optimization was limited to a top-down optimizer and snowflake queries, and the traditional approach adds Bloom filters in a separate post-optimization step. The approach extends the cost model to account for Bloom filter selectivity, models Bloom filter propagation across operators, makes predicate pushdown conditional on the presence of a Bloom filter, and applies heuristic pruning to curb search-space growth. On a 100 GB instance of TPC-H, it achieves a 32.8% further reduction in latency for queries involving Bloom filters compared with the post-optimization approach, at the cost of a limited increase in optimization time.
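To make the idea of a cost model that accounts for Bloom filter selectivity concrete, here is a minimal sketch. It is not the paper's cost model. It uses only the standard false-positive approximation for a Bloom filter, and the function names and parameters are hypothetical, chosen for illustration.

```python
import math


def bloom_false_positive_rate(num_bits: int, num_hashes: int, num_keys: int) -> float:
    """Standard approximation of a Bloom filter's false-positive rate:
    (1 - e^(-k*n/m))^k, with m bits, k hash functions and n inserted keys."""
    if num_keys == 0:
        return 0.0
    return (1.0 - math.exp(-num_hashes * num_keys / num_bits)) ** num_hashes


def rows_after_bloom_filter(probe_rows: float, join_selectivity: float, fpr: float) -> float:
    """Estimated probe-side rows that survive the filter.

    join_selectivity is the (assumed known) fraction of probe rows that truly
    have a match on the build side. Non-matching rows still pass with
    probability fpr, so the filter is never more selective than the join.
    """
    pass_fraction = join_selectivity + (1.0 - join_selectivity) * fpr
    return probe_rows * pass_fraction


# Example: a filter sized at 10 bits per key with 7 hash functions.
fpr = bloom_false_positive_rate(num_bits=10_000, num_hashes=7, num_keys=1_000)
estimate = rows_after_bloom_filter(probe_rows=1_000_000, join_selectivity=0.05, fpr=fpr)
```

An optimizer could feed an estimate like this into the cardinality of the filtered scan, which is how a Bloom filter can change which join order looks cheapest.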

📝 Abstract
Bloom filters are used in query processing to perform early data reduction and improve query performance. The optimal query plan may be different when Bloom filters are used, indicating the need for Bloom filter-aware query optimization. To date, Bloom filter-aware query optimization has only been incorporated in a top-down query optimizer and limited to snowflake queries. In this paper, we show how Bloom filters can be incorporated in a bottom-up cost-based query optimizer. We highlight the challenges in limiting optimizer search space expansion, and offer an efficient solution. We show that including Bloom filters in cost-based optimization can lead to better join orders with effective predicate transfer between operators. On a 100 GB instance of the TPC-H database, our approach achieved a 32.8% further reduction in latency for queries involving Bloom filters, compared to the traditional approach of adding Bloom filters in a separate post-optimization step. Our method applies to all query types, and we provide several heuristics to balance limited increases in optimization time against improved query latency.
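The early data reduction described in the abstract can be illustrated with a small, self-contained sketch: a hash join builds a Bloom filter over the build-side join keys, and the probe side discards rows that cannot match before they reach the join. The `BloomFilter` class and `hash_join_with_bloom` function are hypothetical names for this illustration and do not come from the paper.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; uses one byte per bit position for clarity."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes positions by salting a SHA-256 hash with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key) -> bool:
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


def hash_join_with_bloom(build_rows, probe_rows, key):
    """Join two lists of dicts on `key`, pre-filtering the probe side."""
    bloom = BloomFilter()
    table = {}
    for row in build_rows:
        bloom.add(row[key])
        table.setdefault(row[key], []).append(row)

    # Early reduction: rows rejected here never reach the hash-table probe.
    survivors = [row for row in probe_rows if bloom.might_contain(row[key])]
    joined = [(b, p) for p in survivors for b in table.get(p[key], [])]
    return joined, len(survivors)


build = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
probe = [{"id": i, "qty": i * 10} for i in range(100)]
joined, survivor_count = hash_join_with_bloom(build, probe, "id")
```

In a real engine the filter is usually applied at the probe-side scan, possibly several operators below the join. That placement is why the number of rows flowing through the rest of the plan, and therefore the best join order, can change.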

Problem

Research questions and friction points this paper is trying to address.

Incorporating Bloom filters in bottom-up query optimization
Limiting optimizer search space expansion efficiently
Improving join orders with predicate transfer

Innovation

Methods, ideas, or system contributions that make the work stand out.

Incorporating Bloom filters in bottom-up optimization
Efficiently limiting optimizer search space expansion
Balancing optimization time and query latency
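As a rough illustration of limiting search-space expansion with heuristics, the sketch below discards Bloom filter candidates that are unlikely to pay off before the optimizer enumerates plans for them. The specific rules and thresholds are assumptions made for this example, not the heuristics defined in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BloomCandidate:
    build_table: str        # side whose join keys populate the filter
    probe_table: str        # side whose scan the filter is applied to
    build_rows: float       # estimated build-side cardinality
    probe_rows: float       # estimated probe-side cardinality
    est_selectivity: float  # estimated fraction of probe rows that pass


def keep_candidate(
    c: BloomCandidate,
    max_build_rows: float = 1_000_000,  # hypothetical: keep the filter small
    min_probe_rows: float = 10_000,     # hypothetical: skip tiny probe sides
    max_selectivity: float = 0.5,       # hypothetical: require real reduction
) -> bool:
    """Return True if the candidate is worth adding to the search space."""
    return (
        c.build_rows <= max_build_rows
        and c.probe_rows >= min_probe_rows
        and c.est_selectivity <= max_selectivity
    )


def prune(candidates):
    """Keep only the candidates that pass every heuristic check."""
    return [c for c in candidates if keep_candidate(c)]


candidates = [
    BloomCandidate("orders", "lineitem", 150_000, 6_000_000, 0.10),
    BloomCandidate("nation", "region", 25, 5, 0.20),             # probe side too small
    BloomCandidate("part", "lineitem", 200_000, 6_000_000, 0.95),  # filters almost nothing
]
kept = prune(candidates)
```

Each candidate that survives adds sub-plans for the optimizer to cost, so stricter thresholds trade potential latency gains for shorter optimization time.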
Tim Zeyl
Huawei, Cloud BU, Markham, Canada
Qi Cheng
Huawei, Cloud BU, Markham, Canada
Reza Pournaghi
Huawei, Cloud BU, Markham, Canada
Jason Lam
Huawei, Cloud BU, Markham, Canada
Weicheng Wang
Research assistant, Purdue University
Calvin Wong
Huawei, Cloud BU, Markham, Canada
Chong Chen
Huawei, Cloud BU, Markham, Canada
Per-Åke Larson
Huawei, Cloud BU, Markham, Canada