ReEx-SQL: Reasoning with Execution-Aware Reinforcement Learning for Text-to-SQL

📅 2025-05-19
🤖 AI Summary
Existing Text-to-SQL methods do not integrate execution feedback into the generation process itself, which lets reasoning errors accumulate and reduces accuracy and robustness. To address this, we propose an execution-aware reinforcement learning framework that interacts with the database during decoding to obtain SQL execution feedback and adjust the reasoning path accordingly. Our approach combines structured prompts with markup tags, a stepwise rollout strategy, a composite reward function that includes an exploration reward, and tree-based decoding for exploring alternative reasoning paths. With a 7B model, our method achieves 88.8% on Spider, 64.9% on BIRD, and 85.2% on Spider-Realistic, surpassing the standard reasoning baseline by 2.7% on Spider and 2.6% on BIRD. Tree-based decoding also reduces inference time on the BIRD development set by 51.9% relative to linear decoding.

📝 Abstract
In Text-to-SQL, execution feedback is essential for guiding large language models (LLMs) to reason accurately and generate reliable SQL queries. However, existing methods treat execution feedback solely as a post-hoc signal for correction or selection, failing to integrate it into the generation process. This limitation hinders their ability to address reasoning errors as they occur, ultimately reducing query accuracy and robustness. To address this issue, we propose ReEx-SQL (Reasoning with Execution-Aware Reinforcement Learning), a framework for Text-to-SQL that enables models to interact with the database during decoding and dynamically adjust their reasoning based on execution feedback. ReEx-SQL introduces an execution-aware reasoning paradigm that interleaves intermediate SQL execution into reasoning paths, facilitating context-sensitive revisions. It achieves this through structured prompts with markup tags and a stepwise rollout strategy that integrates execution feedback into each stage of generation. To supervise policy learning, we develop a composite reward function that includes an exploration reward, explicitly encouraging effective database interaction. Additionally, ReEx-SQL adopts a tree-based decoding strategy to support exploratory reasoning, enabling dynamic expansion of alternative reasoning paths. Notably, ReEx-SQL achieves 88.8% on Spider and 64.9% on BIRD at the 7B scale, surpassing the standard reasoning baseline by 2.7% and 2.6%, respectively. It also shows robustness, achieving 85.2% on Spider-Realistic with leading performance. In addition, its tree-structured decoding improves efficiency and performance over linear decoding, reducing inference time by 51.9% on the BIRD development set.
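The interleaving of intermediate SQL execution with reasoning can be illustrated with a minimal sketch. It is not the paper's implementation: the tag names (`<intermediate_sql>`, `<result>`, `<final_sql>`), the `rollout` and `execute` helpers, and the scripted stand-in for the language model are all assumptions made for illustration, and SQLite is used only because it ships with Python.

```python
import re
import sqlite3

# Hypothetical tag names; the abstract only says "markup tags".
SQL_TAG = re.compile(r"<intermediate_sql>(.*?)</intermediate_sql>", re.DOTALL)
FINAL_TAG = re.compile(r"<final_sql>(.*?)</final_sql>", re.DOTALL)


def execute(conn, sql, max_rows=3):
    """Run SQL and return a short textual observation (rows or error)."""
    try:
        rows = conn.execute(sql).fetchmany(max_rows)
        return f"rows: {rows}"
    except sqlite3.Error as exc:
        return f"error: {exc}"


def rollout(generate_step, conn, question, max_steps=5):
    """Alternate generation and execution until a final query appears."""
    context = question
    for _ in range(max_steps):
        step = generate_step(context)
        context += "\n" + step
        final = FINAL_TAG.search(step)
        if final:
            return final.group(1).strip(), context
        probe = SQL_TAG.search(step)
        if probe:
            # Feed the execution result back into the reasoning context.
            feedback = execute(conn, probe.group(1).strip())
            context += f"\n<result>{feedback}</result>"
    return None, context


def scripted_policy(context):
    """Stand-in for an LLM: corrects a column name after seeing an error."""
    if "<result>" not in context:
        return "<intermediate_sql>SELECT nme FROM singer</intermediate_sql>"
    if "error" in context and "SELECT name" not in context:
        return "<intermediate_sql>SELECT name FROM singer</intermediate_sql>"
    return "<final_sql>SELECT name FROM singer WHERE age > 30</final_sql>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO singer VALUES (?, ?)", [("Ana", 41), ("Bo", 25)])
final_sql, trace = rollout(scripted_policy, conn, "Which singers are over 30?")
```

In this toy run the first probe fails on a misspelled column, the error text is appended to the context, and the next step revises the query before the final SQL is emitted. A real system would replace `scripted_policy` with model decoding that pauses at the closing tag.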
Problem

Research questions and friction points this paper is trying to address.

Integrate execution feedback into Text-to-SQL generation process
Address reasoning errors dynamically during SQL query generation
Improve query accuracy and robustness with execution-aware reinforcement learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Execution-aware reinforcement learning for dynamic SQL adjustment
Structured prompts with markup tags for context-sensitive revisions
Tree-based decoding strategy for exploratory reasoning paths
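As a rough illustration of tree-based exploration, the sketch below runs a depth-first search in which sibling branches share a common prefix instead of restarting from scratch. The `propose` callback, the `tree_decode` and `runs_ok` helpers, and the rule of pruning a branch when its SQL fails to execute are assumptions for this example; the summary above does not specify how ReEx-SQL selects or expands branches.

```python
import sqlite3


def runs_ok(conn, sql):
    """Return True if the statement executes without a database error."""
    try:
        conn.execute(sql).fetchall()
        return True
    except sqlite3.Error:
        return False


def tree_decode(propose, conn, prefix=(), depth=0, max_depth=3):
    """Depth-first search over reasoning steps with shared prefixes.

    `propose(prefix)` is a hypothetical stand-in for sampling candidate
    next steps from a model; each candidate is a (sql, is_final) pair.
    """
    if depth == max_depth:
        return None
    for sql, is_final in propose(prefix):
        if not runs_ok(conn, sql):
            continue  # prune this branch; siblings reuse the same prefix
        path = prefix + (sql,)
        if is_final:
            return path
        found = tree_decode(propose, conn, path, depth + 1, max_depth)
        if found:
            return found
    return None


def toy_propose(prefix):
    """Toy candidate generator: one bad and one good first step."""
    if not prefix:
        return [
            ("SELECT nme FROM singer", False),
            ("SELECT name FROM singer", False),
        ]
    return [("SELECT name FROM singer WHERE age > 30", True)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
path = tree_decode(toy_propose, conn)
```

Here the failing first candidate is discarded after a single execution, and the search continues from the shared root rather than regenerating an entire linear trace.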
Yaxun Dai
Soochow University
Wenxuan Xie
South China University of Technology
Xialie Zhuang
Ubiquant
LLM
Tianyu Yang
Alibaba DAMO Academy
Yiying Yang
Fudan University
3D computer vision, machine learning
Haiqin Yang
International Digital Economy Academy (IDEA), China
Yuhang Zhao
Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ)
Pingfu Chao
Associate Professor, Soochow University, China
Database, Data Mining, Spatial-Temporal Data Management
Wenhao Jiang
GML, Tencent, PolyU
Computer Vision, Machine Learning, Foundation Models