🤖 AI Summary
Although structural decomposition methods offer theoretical advantages, their integration into cost-based query optimization frameworks remains challenging, which limits their practical applicability. This work proposes a novel representation called *meta-decomposition*, which, for the first time, compactly captures all join trees of an acyclic query in size linear in the query, and introduces a new width measure that bounds both plan complexity and intermediate result sizes. Building on this foundation, we design a polynomial-time construction algorithm and an efficient cost-based optimization strategy that operates directly on the meta-decomposition, avoiding explicit enumeration of join trees. Experimental results demonstrate that our approach produces execution plans for large, complex queries whose quality matches or even surpasses that of optimal plans found by dynamic programming, while planning runs orders of magnitude faster, approaching the speed of heuristic methods.
📝 Abstract
Structural decomposition methods offer powerful theoretical guarantees for join evaluation, yet they are rarely used in real-world query optimizers. A major reason is the difficulty of combining cost-based plan search with structure-based evaluation. In this work, we bridge this gap by introducing meta-decompositions for acyclic queries, a novel representation that succinctly encodes all possible join trees and enables their efficient enumeration. Meta-decompositions can be constructed in polynomial time and have size linear in the query size. We design an efficient polynomial-time cost-based optimizer that works directly on the meta-decomposition, without explicitly enumerating all possible join trees. We characterize the plans found by this approach using a novel notion of width, which yields worst-case asymptotic bounds on the intermediate result sizes and running time of any such query plan. Experimental results demonstrate that, in practice, the plans in our class are consistently comparable to, and in many cases better than, the optimal ones found by the state-of-the-art dynamic programming approach, especially on large and complex queries, while our planning process runs orders of magnitude faster, comparable to the time taken by common heuristic methods.
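As background for the abstract's setting: an acyclic query is one whose hypergraph admits a join tree, and the classical way to test this and build one such tree is the GYO ear-removal reduction. The sketch below is *not* the paper's meta-decomposition (which compactly encodes *all* join trees); it is a minimal, hypothetical illustration of producing a single join tree, with the function name `gyo_join_tree` and the dict-of-frozensets query encoding chosen here for illustration.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

def gyo_join_tree(
    query: Dict[str, FrozenSet[str]]
) -> Optional[List[Tuple[str, str]]]:
    """GYO ear removal: repeatedly delete an 'ear' hyperedge whose
    shared attributes are covered by some other hyperedge (its witness).
    Returns the join-tree edges (witness, ear) if the query is acyclic,
    or None if no ear can be removed (i.e., the query is cyclic)."""
    remaining = dict(query)
    tree_edges: List[Tuple[str, str]] = []
    progress = True
    while len(remaining) > 1 and progress:
        progress = False
        for name, attrs in list(remaining.items()):
            others = {n: a for n, a in remaining.items() if n != name}
            # Attributes of `name` that also occur in some other relation.
            shared = attrs & frozenset().union(*others.values())
            # `name` is an ear if one other relation covers all shared attributes.
            witness = next((n for n, a in others.items() if shared <= a), None)
            if witness is not None:
                tree_edges.append((witness, name))  # attach ear under witness
                del remaining[name]
                progress = True
                break
    return tree_edges if len(remaining) == 1 else None

# A path query R(a,b) ⋈ S(b,c) ⋈ T(c,d) is acyclic; a triangle is not.
path = {"R": frozenset("ab"), "S": frozenset("bc"), "T": frozenset("cd")}
triangle = {"R": frozenset("ab"), "S": frozenset("bc"), "T": frozenset("ac")}
```

Note that the order in which ears are removed is arbitrary, so this procedure yields just one of possibly many join trees; the paper's contribution is precisely to represent the whole space of such trees in linear size so a cost-based optimizer can search it without enumeration.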