Chain-of-Query: Unleashing the Power of LLMs in SQL-Aided Table Understanding via Multi-Agent Collaboration

📅 2025-08-14
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle with semantic parsing in table understanding tasks due to structural complexity, while existing multi-agent SQL generation approaches suffer from schema misinterpretation, error propagation, and overreliance on execution feedback. To address these issues, we propose Chain-of-Query (CoQ), a novel multi-agent framework comprising three core components: (1) natural-language schema representation to mitigate structural noise; (2) clause-level progressive SQL generation that decouples logical units (e.g., SELECT, WHERE); and (3) a hybrid reasoning mechanism that explicitly separates symbolic execution from semantic inference, thereby reducing dependence on query execution outcomes. Extensive experiments across four state-of-the-art LLMs and five benchmark datasets demonstrate that CoQ significantly improves SQL correctness—reducing invalid SQL generation by up to 42.6%—and achieves superior robustness and generalization in table-aware semantic parsing.
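To make the first component concrete, here is a minimal illustrative sketch (not the authors' code; the function name `describe_schema` and its format are assumptions) of what a natural-language schema representation might look like: each column is rendered as a plain-English sentence rather than raw headers or DDL, so the model reasons over prose instead of structural markup.

```python
# Hypothetical sketch of "natural-language schema representation":
# render a table's schema as English sentences to abstract away
# structural noise before handing it to an LLM agent.

def describe_schema(table_name, columns, sample_row):
    """Render a table schema and one example row as natural language."""
    lines = [f"The table '{table_name}' has {len(columns)} columns."]
    for col, value in zip(columns, sample_row):
        lines.append(f"Column '{col}' holds values such as {value!r}.")
    return " ".join(lines)

print(describe_schema(
    "athletes",
    ["name", "country", "medals"],
    ["Usain Bolt", "Jamaica", 8],
))
```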

📝 Abstract
Table understanding requires structured, multi-step reasoning. Large Language Models (LLMs) struggle with it due to the structural complexity of tabular data. Recently, multi-agent frameworks for SQL generation have shown promise in tackling the challenges of understanding tabular data, but existing approaches often suffer from limitations such as the inability to comprehend table structure for reliable SQL generation, error propagation that results in invalid queries, and over-reliance on execution correctness. To address these issues, we propose Chain-of-Query (CoQ), a novel multi-agent framework for SQL-aided table understanding. CoQ adopts natural-language-style representations of table schemas to abstract away structural noise and enhance understanding. It employs a clause-by-clause SQL generation strategy to improve query quality and introduces a hybrid reasoning division that separates SQL-based mechanical reasoning from LLM-based logical inference, thereby reducing reliance on execution outcomes. Extensive experiments across four models and five widely used benchmarks demonstrate that CoQ achieves substantial accuracy improvements and significantly lowers invalid SQL rates compared to prior generic LLM-based, SQL-aided, and hybrid baselines, confirming its superior effectiveness in table understanding. The code is available at https://github.com/SongyuanSui/ChainofQuery.
Problem

Research questions and friction points this paper is trying to address.

LLMs struggle with structured reasoning for complex tabular data
Existing SQL generation approaches produce invalid queries with error propagation
Current methods over-rely on execution correctness for table understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework with natural language schema representation
Clause-by-clause SQL generation for improved query quality
Hybrid reasoning separating mechanical and logical inference
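The second innovation, clause-by-clause generation, can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: `generate_clause` is a hypothetical placeholder for one LLM agent turn, and the canned outputs only demonstrate the assembly logic of decoupling logical units (SELECT, WHERE, etc.) so that each clause is produced and checked independently before the final query is stitched together.

```python
# Illustrative sketch of clause-level progressive SQL generation:
# each clause is a separate generation step, and the full query is
# assembled only after every unit is produced, limiting error
# propagation between clauses.

CLAUSE_ORDER = ["SELECT", "FROM", "WHERE", "GROUP BY", "ORDER BY"]

def generate_clause(keyword, question, schema):
    """Stand-in for one agent turn; a real system would prompt an LLM here."""
    canned = {
        "SELECT": "SELECT country, COUNT(*) AS n",
        "FROM": "FROM athletes",
        "WHERE": "WHERE medals > 0",
        "GROUP BY": "GROUP BY country",
        "ORDER BY": "ORDER BY n DESC",
    }
    return canned.get(keyword, "")

def assemble_query(question, schema):
    """Generate each clause in order, drop empty ones, and join."""
    clauses = [generate_clause(k, question, schema) for k in CLAUSE_ORDER]
    return " ".join(c for c in clauses if c)

print(assemble_query(
    "Which country won the most medals?",
    "athletes(name, country, medals)",
))
```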
👥 Authors

Songyuan Sui (Rice University)
Hongyi Liu (Rice University)
Serena Liu (Rice University)
Li Li (Samsung Electronics America)
Soo-Hyun Choi (Director, Machine Learning Engineering at Warner Bros. Discovery)
Rui Chen (Samsung Electronics America)
Xia Hu (Google DeepMind)