IESR: Efficient MCTS-Based Modular Reasoning for Text-to-SQL with Large Language Models

📅 2026-02-05
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses key limitations of existing Text-to-SQL approaches, which struggle with complex reasoning, integration of domain knowledge, and hypothetical queries, while also incurring high deployment costs in enterprise settings. To overcome these challenges, the authors propose the IESR framework, which uses a lightweight, non-finetuned large language model to perform information-enhanced structured reasoning. IESR decouples mathematical computation from SQL generation, integrates schema linking with semantic understanding, and introduces a Monte Carlo Tree Search (MCTS)-based multi-path reasoning mechanism coupled with trajectory consistency verification. Without any model fine-tuning, this approach achieves state-of-the-art performance on complex reasoning benchmarks, including LogicCat (24.28 EX) and Archer (37.28 EX), demonstrating significant gains in reasoning accuracy using only lightweight models.
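The multi-path reasoning with majority voting summarized above can be sketched roughly as follows. This is a minimal illustration, not the paper's code: in IESR each candidate would come from one MCTS reasoning trajectory, whereas here the candidate SQL strings and the lightweight normalization are purely hypothetical.

```python
from collections import Counter

def majority_vote(candidate_sqls):
    """Self-consistency voting: return the most frequent candidate
    (after whitespace/case normalization) and its agreement ratio."""
    normalized = [" ".join(sql.lower().split()) for sql in candidate_sqls]
    winner, count = Counter(normalized).most_common(1)[0]
    return winner, count / len(normalized)

# Three of four hypothetical reasoning trajectories agree:
sql, agreement = majority_vote([
    "SELECT name FROM users WHERE age > 30",
    "select name  from users where age > 30",
    "SELECT name FROM users WHERE age >= 30",
    "SELECT name FROM users WHERE age > 30",
])
```

A production system would more likely vote on execution results rather than raw SQL strings, since syntactically different queries can be semantically equivalent.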

📝 Abstract
Text-to-SQL is a key natural language processing task that maps natural language questions to SQL queries, enabling intuitive interaction with web-based databases. Although current methods perform well on benchmarks like BIRD and Spider, they struggle with complex reasoning, domain knowledge, and hypothetical queries, and remain costly in enterprise deployment. To address these issues, we propose IESR (Information Enhanced Structured Reasoning), a framework for lightweight large language models that (i) leverages LLMs for key information understanding and schema linking, and decouples mathematical computation from SQL generation, (ii) integrates a multi-path reasoning mechanism based on Monte Carlo Tree Search (MCTS) with majority voting, and (iii) introduces a trajectory consistency verification module with a discriminator model to ensure accuracy and consistency. Experimental results demonstrate that IESR achieves state-of-the-art performance on the complex reasoning benchmark LogicCat (24.28 EX) and the Archer dataset (37.28 EX) using only compact lightweight models without fine-tuning. Furthermore, our analysis reveals that current coder models exhibit notable biases and deficiencies in physical knowledge, mathematical computation, and common-sense reasoning, highlighting important directions for future research. We released code at https://github.com/Ffunkytao/IESR-SLM.
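The MCTS component mentioned in the abstract relies on a selection rule that balances exploiting high-reward reasoning paths against exploring under-visited ones. A minimal sketch of the standard UCT selection step is shown below; the node labels, visit counts, and reward values are illustrative assumptions, not taken from IESR's implementation.

```python
import math

class Node:
    """One node in a search tree over partial reasoning steps."""
    def __init__(self, label, parent=None):
        self.label, self.parent = label, parent
        self.children, self.visits, self.value = [], 0, 0.0

def uct_select(node, c=1.4):
    # UCT score: mean reward plus an exploration bonus that
    # shrinks as a child accumulates visits.
    return max(
        node.children,
        key=lambda ch: ch.value / (ch.visits + 1e-9)
        + c * math.sqrt(math.log(node.visits + 1) / (ch.visits + 1e-9)),
    )

# Tiny example: two equally-visited children, one with higher mean reward.
root = Node("question"); root.visits = 10
good = Node("path-A", root); good.visits, good.value = 5, 4.0
weak = Node("path-B", root); weak.visits, weak.value = 5, 1.0
root.children = [good, weak]
chosen = uct_select(root)
```

With equal visit counts the exploration bonuses cancel, so selection reduces to comparing mean rewards, which is why `path-A` is chosen here.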
Problem

Research questions and friction points this paper is trying to address.

Text-to-SQL
complex reasoning
domain knowledge
hypothetical queries
enterprise deployment
Innovation

Methods, ideas, or system contributions that make the work stand out.

MCTS
modular reasoning
lightweight LLM
schema linking
trajectory consistency
👥 Authors

Tao Liu (Zhengzhou University)
Jiafan Lu (Tianjin University)
Bohan Yu (Zhengzhou University)
Pengcheng Wu (Volvo Cars / KTH Royal Institute of Technology): motion planning and control of robotics; state estimation and uncertainty quantification; safety
Haixin Liu (Zhengzhou University)
Guoyu Xu (Zhengzhou University)
Xiangheng Li (Zhengzhou University)
Lixiao Li (Zhengzhou University)
Jiaming Hou (Zhengzhou University)
Shijun Zhao (Zhengzhou University)
Xinglin Lyu (PhD Student of Software Engineering, Soochow University): Machine Translation; Natural Language Processing
Kunli Zhang (Zhengzhou University)
Yuxiang Jia (Zhengzhou University): Natural Language Processing
Hongying Zan (Zhengzhou University)