Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration

📅 2025-08-14

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Large language models (LLMs) struggle with semantic parsing in table understanding tasks due to structural complexity, while existing multi-agent SQL generation approaches suffer from schema misinterpretation, error propagation, and overreliance on execution feedback. To address these issues, we propose Chain-of-Query (CoQ), a novel multi-agent framework comprising three core components: (1) natural-language schema representation to mitigate structural noise; (2) clause-level progressive SQL generation that decouples logical units (e.g., SELECT, WHERE); and (3) a hybrid reasoning mechanism that explicitly separates symbolic execution from semantic inference, thereby reducing dependence on query execution outcomes. Extensive experiments across four state-of-the-art LLMs and five benchmark datasets demonstrate that CoQ significantly improves SQL correctness—reducing invalid SQL generation by up to 42.6%—and achieves superior robustness and generalization in table-aware semantic parsing.

📝 Abstract

Table understanding requires structured, multi-step reasoning. Large Language Models (LLMs) struggle with it due to the structural complexity of tabular data. Recently, multi-agent frameworks for SQL generation have shown promise in tackling the challenges of understanding tabular data, but existing approaches often suffer from limitations such as the inability to comprehend table structure for reliable SQL generation, error propagation that results in invalid queries, and over-reliance on execution correctness. To address these issues, we propose Chain-of-Query (CoQ), a novel multi-agent framework for SQL-aided table understanding. CoQ adopts natural-language-style representations of table schemas to abstract away structural noise and enhance understanding. It employs a clause-by-clause SQL generation strategy to improve query quality and introduces a hybrid reasoning division that separates SQL-based mechanical reasoning from LLM-based logical inference, thereby reducing reliance on execution outcomes. Extensive experiments across four models and five widely used benchmarks demonstrate that CoQ achieves substantial accuracy improvements and significantly lowers invalid SQL rates compared to prior generic LLM-based, SQL-aided, and hybrid baselines, confirming its superior effectiveness in table understanding. The code is available at https://github.com/SongyuanSui/ChainofQuery.
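To make the idea of a natural-language-style schema representation concrete, the sketch below renders a table's columns as plain-English sentences that could be placed in a prompt. It is only an illustration under stated assumptions, not the format used in the CoQ repository: the function name `describe_schema`, the sentence template, and the `medals` example table are all hypothetical, and SQLite is used purely for convenience.

```python
import sqlite3

def describe_schema(conn, table, sample_rows=2):
    """Render a table's schema as plain-English sentences (hypothetical format)."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    lines = [f'The table "{table}" has {len(cols)} columns.']
    for _, name, ctype, *_ in cols:
        # Pull a few sample values so the description shows what the column holds.
        values = [row[0] for row in conn.execute(
            f'SELECT "{name}" FROM "{table}" LIMIT ?', (sample_rows,))]
        examples = ", ".join(repr(v) for v in values)
        lines.append(
            f'Column "{name}" stores {ctype or "untyped"} values, '
            f'for example {examples}.')
    return "\n".join(lines)

# Hypothetical example table, used only to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (country TEXT, gold INTEGER)")
conn.executemany("INSERT INTO medals VALUES (?, ?)",
                 [("Norway", 16), ("Germany", 12), ("Canada", 4)])

schema_text = describe_schema(conn, "medals")
print(schema_text)
```

The description replaces a raw `CREATE TABLE` statement or a serialized table dump with sentences, which is one plausible reading of "abstracting away structural noise"; the actual wording CoQ uses may differ.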

Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with structured reasoning for complex tabular data

Existing SQL generation approaches produce invalid queries with error propagation

Current methods over-rely on execution correctness for table understanding

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework with natural language schema representation

Clause-by-clause SQL generation for improved query quality

Hybrid reasoning separating mechanical and logical inference
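The last two ideas above can be sketched together in a few lines. The example below builds a query one clause at a time, keeps a clause only if the partial query still compiles, and then hands the retrieved rows to a separate inference step. Everything here is an assumption made for illustration: the helper names (`assemble`, `add_clause`, `infer`), the hard-coded clause proposals that stand in for LLM agent output, the use of SQLite's `EXPLAIN` as the validity check, and the `medals` table. It is not the authors' implementation.

```python
import sqlite3

# Syntactic order in which clauses appear in the final statement.
CLAUSE_ORDER = ["SELECT", "FROM", "WHERE", "ORDER BY", "LIMIT"]

def assemble(clauses):
    """Join the clauses chosen so far; SELECT defaults to * until it is decided."""
    parts = {"SELECT": "*", **clauses}
    return " ".join(f"{kw} {parts[kw]}" for kw in CLAUSE_ORDER if kw in parts)

def add_clause(conn, clauses, keyword, body):
    """Return (clauses, accepted); a clause is kept only if the query compiles."""
    candidate = {**clauses, keyword: body}
    try:
        # EXPLAIN makes SQLite compile the statement without running the query.
        conn.execute("EXPLAIN " + assemble(candidate))
    except sqlite3.Error:
        return clauses, False  # reject and keep the previous valid state
    return candidate, True

# Hypothetical example table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE medals (country TEXT, gold INTEGER)")
conn.executemany("INSERT INTO medals VALUES (?, ?)",
                 [("Norway", 16), ("Germany", 12), ("Canada", 4)])

# Hard-coded proposals standing in for what an LLM agent might emit.
proposals = [
    ("FROM", "medals"),
    ("WHERE", "golds > 5"),       # misspelled column: rejected
    ("WHERE", "gold > 5"),        # corrected retry: accepted
    ("SELECT", "country, gold"),
    ("ORDER BY", "gold DESC"),
]

clauses, log = {}, []
for keyword, body in proposals:
    clauses, accepted = add_clause(conn, clauses, keyword, body)
    log.append((keyword, body, accepted))

sql = assemble(clauses)
rows = conn.execute(sql).fetchall()  # mechanical step: SQL retrieves the evidence

def infer(rows):
    """Stand-in for the LLM inference step: gap between the top two gold counts."""
    return rows[0][1] - rows[1][1]

print(sql)
print(rows, infer(rows))
```

Because each clause is checked before the next is added, the misspelled `WHERE` clause is caught on its own and never contaminates the rest of the query, which is the error-propagation problem the clause-by-clause strategy targets. Keeping `infer` outside the SQL reflects the split between mechanical retrieval and logical inference; in a real system that function would be an LLM call rather than fixed arithmetic.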

🔎 Similar Papers

MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL