🤖 AI Summary
This work addresses the challenges of automated query rewrite rule discovery: combinatorial explosion in the search space, structural redundancy, and limited scalability to complex query plans with five or more nodes. The authors propose a framework that combines standardized query templates with Learning-to-Rank (LTR). By preserving operator structures and stripping away data-specific details, the approach eliminates redundant candidates, while an LTR model pre-filters the most promising template pairs to make enumeration more efficient. The resulting system automatically generates over one million rules from more than 11,000 real-world SQL queries, which the authors describe as the largest empirically validated rewrite rule repository to date. At that scale it supports query plan templates with complexity up to channel-level depth, and its design is extensible to new and complex operators.
📝 Abstract
Query rewriting is essential for database performance optimization, but existing automated rule enumeration methods suffer from exponential search spaces, severe redundancy, and poor scalability, especially when handling complex query plans with five or more nodes, where a node represents an operator in the plan tree. We present SLER, a scalable system that enables efficient and effective rewrite rule discovery by combining standardized template enumeration with a learning-to-rank approach. SLER uses standardized templates, which are abstractions of query plans that preserve operator structure but remove data-specific details, to eliminate structural redundancies and drastically reduce the search space. A learning-to-rank model guides enumeration by pre-filtering the most promising template pairs, enabling scalable rule generation for templates with many nodes. Evaluated on over 11,000 real-world SQL queries from both open-source and commercial workloads, SLER has automatically constructed a rewrite rule repository exceeding 1 million rules, the largest empirically validated rewrite rule library to date. Notably, at the scale of one million rules, SLER supports query plan templates with complexity up to channel-level depth. This unprecedented scale opens the door to discovering highly intricate transformations across diverse query patterns. Critically, SLER's template-driven design and learned ranking mechanism are inherently extensible, allowing seamless integration of new and complex operators and paving the way for next-generation optimizers powered by comprehensive, adaptive rule spaces.
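As a rough illustration of the standardized-template idea in the abstract, the sketch below abstracts a plan tree to its operator structure and uses that form to collapse structurally identical plans. It is not SLER's implementation: the `PlanNode` class, the `to_template` and `node_count` helpers, and the example plans are all hypothetical names introduced here, and the learning-to-rank step is not modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PlanNode:
    """Hypothetical plan-tree node: one operator per node."""
    op: str                               # operator name, e.g. "Filter"
    detail: str = ""                      # data-specific detail (table, predicate, column)
    children: Tuple["PlanNode", ...] = ()


def to_template(node: PlanNode) -> tuple:
    """Keep the operator structure, drop the data-specific detail."""
    return (node.op, tuple(to_template(child) for child in node.children))


def node_count(template: tuple) -> int:
    """Number of operators in a template."""
    return 1 + sum(node_count(child) for child in template[1])


# Two plans that differ only in tables and predicates, plus one that differs in shape.
q1 = PlanNode("Project", "name",
              (PlanNode("Filter", "age > 30", (PlanNode("Scan", "users"),)),))
q2 = PlanNode("Project", "total",
              (PlanNode("Filter", "price < 5", (PlanNode("Scan", "orders"),)),))
q3 = PlanNode("Join", "a.id = b.id",
              (PlanNode("Scan", "a"), PlanNode("Scan", "b")))

# Structurally identical plans map to the same template, so a set removes them.
templates = {to_template(q) for q in (q1, q2, q3)}
```

Here `q1` and `q2` share one template because they differ only in data-specific details, so three plans reduce to two candidates before any pairing or ranking would take place.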