🤖 AI Summary
Large language models (LLMs) struggle with table understanding because of the structural complexity of tabular data, while existing multi-agent SQL generation approaches suffer from schema misinterpretation, error propagation, and over-reliance on execution feedback. To address these issues, we propose Chain-of-Query (CoQ), a novel multi-agent framework comprising three core components: (1) natural-language schema representation to mitigate structural noise; (2) clause-level progressive SQL generation that builds a query one logical unit at a time (e.g., SELECT, then WHERE); and (3) a hybrid reasoning mechanism that explicitly separates SQL-based mechanical execution from LLM-based logical inference, thereby reducing dependence on query execution outcomes. Extensive experiments across four LLMs and five benchmark datasets demonstrate that CoQ improves accuracy and SQL validity over prior baselines, reducing invalid SQL generation by up to 42.6%.
📝 Abstract
Table understanding requires structured, multi-step reasoning. Large Language Models (LLMs) struggle with it due to the structural complexity of tabular data. Recently, multi-agent frameworks for SQL generation have shown promise in tackling the challenges of understanding tabular data, but existing approaches often suffer from limitations such as the inability to comprehend table structure for reliable SQL generation, error propagation that results in invalid queries, and over-reliance on execution correctness. To address these issues, we propose Chain-of-Query (CoQ), a novel multi-agent framework for SQL-aided table understanding. CoQ adopts natural-language-style representations of table schemas to abstract away structural noise and enhance understanding. It employs a clause-by-clause SQL generation strategy to improve query quality and introduces a hybrid reasoning division that separates SQL-based mechanical reasoning from LLM-based logical inference, thereby reducing reliance on execution outcomes. Extensive experiments across four models and five widely used benchmarks demonstrate that CoQ achieves substantial accuracy improvements and significantly lowers invalid SQL rates compared to prior generic LLM-based, SQL-aided, and hybrid baselines, confirming its superior effectiveness in table understanding. The code is available at https://github.com/SongyuanSui/ChainofQuery.
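To make two of the ideas above concrete, here is a minimal Python sketch of (a) rendering a table schema as a natural-language sentence and (b) growing a SQL query one clause at a time, keeping a clause only if the partial query still compiles. It is an illustration under stated assumptions, not the CoQ implementation: the helper names (`describe_schema`, `compiles`, `build_query_stepwise`), the sentence format, and the `players` table are hypothetical, the clause proposals are hard-coded where a real system would obtain them from LLM agents, and an invalid clause is simply dropped rather than regenerated.

```python
import sqlite3

# Optional clauses are appended to the base SELECT ... FROM query in this order.
OPTIONAL_CLAUSES = ["WHERE", "GROUP BY", "ORDER BY", "LIMIT"]


def describe_schema(table, columns):
    """Render a schema as one plain-English sentence (hypothetical format)."""
    parts = [f'"{name}" ({dtype})' for name, dtype in columns]
    return f'The table "{table}" has {len(columns)} columns: ' + ", ".join(parts) + "."


def compiles(conn, sql):
    """Return True if SQLite can compile the statement (it is not executed for results)."""
    try:
        conn.execute("EXPLAIN " + sql).fetchall()
        return True
    except sqlite3.Error:
        return False


def build_query_stepwise(conn, clauses):
    """Start from SELECT/FROM, then add each optional clause only if it compiles.

    Assumption: a clause that breaks the query is dropped; a real system
    would more likely ask the model to regenerate it.
    """
    sql = f'SELECT {clauses["SELECT"]} FROM {clauses["FROM"]}'
    if not compiles(conn, sql):
        raise ValueError(f"base query does not compile: {sql}")
    for keyword in OPTIONAL_CLAUSES:
        if keyword not in clauses:
            continue
        candidate = f"{sql} {keyword} {clauses[keyword]}"
        if compiles(conn, candidate):
            sql = candidate
    return sql


# Toy table standing in for a benchmark table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players (name TEXT, team TEXT, points INTEGER)")
conn.executemany(
    "INSERT INTO players VALUES (?, ?, ?)",
    [("Ann", "A", 30), ("Bob", "B", 12)],
)

schema_text = describe_schema(
    "players", [("name", "TEXT"), ("team", "TEXT"), ("points", "INTEGER")]
)

# Hard-coded stand-ins for per-clause model outputs; the WHERE clause
# references a column that does not exist, so it should be rejected.
proposed = {
    "SELECT": "name, points",
    "FROM": "players",
    "WHERE": "pts > 20",
    "ORDER BY": "points DESC",
    "LIMIT": "1",
}

final_sql = build_query_stepwise(conn, proposed)
print(schema_text)
print(final_sql)
print(conn.execute(final_sql).fetchall())
```

Checking each clause as it is added localizes an error to the clause that introduced it, which is the intuition behind limiting error propagation. The final step, where an LLM reasons over the returned rows instead of trusting the SQL result alone, is outside the scope of this sketch.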