AQORA: A Learned Adaptive Query Optimizer for Spark SQL

📅 2025-10-12
🤖 AI Summary
Current query optimization faces two key bottlenecks. Learned query optimization (LQO) commits to a static plan and learns inefficiently because it has no runtime cardinality feedback; adaptive query processing (AQP) relies heavily on rule-based heuristics and cannot learn from experience. This paper proposes AQORA, a reinforcement-learning-based adaptive optimizer for Spark SQL that unifies LQO and AQP. It combines realistic feature encoding, query stage-level feedback and intervention, automatic strategy adaptation, and low-cost integration to adjust plans during execution. Experiments show that AQORA reduces end-to-end execution time by up to 90% compared to other learned optimizers and by up to 70% compared to Spark SQL's default configuration with adaptive query execution. It is reported to improve both optimization efficiency and generalization across diverse workloads.

📝 Abstract
Recent studies have identified two main approaches to improve query optimization: learned query optimization (LQO), which generates or selects better query plans before execution based on models trained in advance, and adaptive query processing (AQP), which adapts the query plan during execution based on statistical feedback collected at runtime. Although both approaches have shown promise, they also face critical limitations. LQO must commit to a fixed plan without access to actual cardinalities and typically relies on a single end-to-end feedback signal, making learning inefficient. On the other hand, AQP depends heavily on rule-based heuristics and lacks the ability to learn from experience. In this paper, we present AQORA, an adaptive query optimizer with a reinforcement learning architecture that combines the strengths of both LQO and AQP. AQORA addresses the above challenges through four core strategies: (1) realistic feature encoding, (2) query stage-level feedback and intervention, (3) automatic strategy adaptation, and (4) low-cost integration. Experiments show that AQORA reduces end-to-end execution time by up to 90% compared to other learned methods and by up to 70% compared to Spark SQL's default configuration with adaptive query execution.
Problem

Research questions and friction points this paper is trying to address.

Combining learned and adaptive query optimization approaches
Addressing limitations of static plans and runtime adaptation
Improving Spark SQL performance through reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning architecture combining LQO and AQP
Query stage-level feedback with automatic strategy adaptation
Realistic feature encoding with low-cost integration
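
To make the contrast between a single end-to-end feedback signal and query stage-level feedback concrete, here is a minimal Python sketch. It is not AQORA's reward function or feature encoding, which this page does not describe. The `StageObservation` record, the negative-runtime reward, and the q-error threshold of 10 are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StageObservation:
    """Hypothetical record of one finished query stage (illustrative only)."""
    stage_id: int
    estimated_rows: int  # cardinality guessed before the stage ran
    actual_rows: int     # cardinality observed once the stage finished
    seconds: float       # wall-clock time spent in the stage


def end_to_end_reward(stages: List[StageObservation]) -> float:
    """One scalar for the whole query: the learner cannot tell which stage hurt."""
    return -sum(s.seconds for s in stages)


def stage_level_rewards(stages: List[StageObservation]) -> List[float]:
    """One scalar per stage: feedback arrives while the query is still running."""
    return [-s.seconds for s in stages]


def misestimated_stages(stages: List[StageObservation], factor: float = 10.0) -> List[int]:
    """Flag stages whose q-error, max(est/act, act/est), reaches `factor`.

    Such stages are natural points to reconsider the remaining plan,
    because the actual cardinality is now known.
    """
    flagged = []
    for s in stages:
        est = max(s.estimated_rows, 1)  # avoid division by zero
        act = max(s.actual_rows, 1)
        if max(est / act, act / est) >= factor:
            flagged.append(s.stage_id)
    return flagged


# Made-up observations for a two-stage query.
stages = [
    StageObservation(stage_id=0, estimated_rows=1000, actual_rows=1200, seconds=2.0),
    StageObservation(stage_id=1, estimated_rows=100, actual_rows=50000, seconds=8.5),
]
print(end_to_end_reward(stages))
print(stage_level_rewards(stages))
print(misestimated_stages(stages))
```

Both signals carry the same total cost, but only the per-stage view shows where the time went and which estimates were wrong. This is the learning inefficiency the abstract attributes to a single end-to-end signal.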
Authors
Jiahao He
School of Information, Renmin University of China, Beijing, China
Yutao Cui
Tencent Hunyuan
Generative Models, Multi-Modal, Object Tracking
Cuiping Li
Renmin University of China
Database, big data analysis and mining
Jikang Jiang
School of Information, Renmin University of China, Beijing, China
Yuheng Hou
School of Information, Renmin University of China, Beijing, China
Hong Chen
School of Information, Renmin University of China, Beijing, China