🤖 AI Summary
This work addresses key limitations of existing Text-to-SQL approaches, which struggle with complex reasoning, integration of domain knowledge, and hypothetical queries, while also incurring high deployment costs in enterprise settings. To overcome these challenges, the authors propose the IESR framework, which leverages a lightweight, non-finetuned large language model to perform information-enhanced structured reasoning. IESR decouples mathematical computation from SQL generation, integrates schema linking with semantic understanding, and introduces a Monte Carlo Tree Search (MCTS)-based multi-path reasoning mechanism coupled with trajectory consistency verification. Without any model fine-tuning, this approach achieves state-of-the-art performance on complex reasoning benchmarks, including LogicCat (24.28 EX) and Archer (37.28 EX), demonstrating significant gains in reasoning accuracy using only lightweight models.
📝 Abstract
Text-to-SQL is a key natural language processing task that maps natural language questions to SQL queries, enabling intuitive interaction with web-based databases. Although current methods perform well on benchmarks like BIRD and Spider, they struggle with complex reasoning, domain knowledge, and hypothetical queries, and remain costly in enterprise deployment. To address these issues, we propose IESR (Information-Enhanced Structured Reasoning), a framework for lightweight large language models that (i) leverages LLMs for key-information understanding and schema linking, and decouples mathematical computation from SQL generation; (ii) integrates a multi-path reasoning mechanism based on Monte Carlo Tree Search (MCTS) with majority voting; and (iii) introduces a trajectory consistency verification module with a discriminator model to ensure accuracy and consistency. Experimental results demonstrate that IESR achieves state-of-the-art performance on the complex reasoning benchmark LogicCat (24.28 EX) and the Archer dataset (37.28 EX) using only compact lightweight models without fine-tuning. Furthermore, our analysis reveals that current coder models exhibit notable biases and deficiencies in physical knowledge, mathematical computation, and common-sense reasoning, highlighting important directions for future research. Our code is released at https://github.com/Ffunkytao/IESR-SLM.
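The majority-voting step over multiple reasoning paths can be illustrated with a minimal sketch. This is not the paper's implementation: the MCTS search and trajectory verification are omitted, the function names (`normalize_sql`, `majority_vote`) and the sample queries are hypothetical, and a real system would likely compare candidates by execution result rather than by normalized text.

```python
from collections import Counter

def normalize_sql(sql: str) -> str:
    """Crude normalization so textually equivalent candidates vote together
    (lowercase + whitespace collapse only; execution-based comparison is
    more robust but needs a live database)."""
    return " ".join(sql.lower().split())

def majority_vote(candidates: list[str]) -> str:
    """Return the candidate whose normalized form is most frequent
    across the sampled reasoning paths."""
    counts = Counter(normalize_sql(c) for c in candidates)
    winner, _ = counts.most_common(1)[0]
    # Return an original-form candidate matching the winning normalization.
    return next(c for c in candidates if normalize_sql(c) == winner)

# Three hypothetical reasoning paths; two agree up to formatting.
paths = [
    "SELECT name FROM users WHERE age > 30",
    "select name from users where age > 30",
    "SELECT name FROM users WHERE age >= 30",
]
print(majority_vote(paths))  # the "> 30" variant wins 2-to-1
```

In IESR this vote is applied to candidates produced by the MCTS multi-path search, and the surviving candidate is further checked by the trajectory consistency verification module.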