TableZoomer: A Collaborative Agent Framework for Large-scale Table Question Answering

📅 2025-09-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
Industrial table question answering (TQA) faces three key challenges: structural heterogeneity, difficulty in target data localization, and bottlenecks in complex reasoning. To address these, this paper proposes a large language model (LLM)-based programmable agent framework. The method replaces raw textual tables with structured schema representations and introduces a query-aware dynamic subtable scaling mechanism. It integrates Program-of-Thoughts (PoT) with the ReAct paradigm to construct an executable, iterative reasoning pipeline. Column selection and entity linking are incorporated to jointly enhance semantic understanding and code generation, thereby improving localization accuracy and reasoning controllability. Evaluated on DataBench and TableBench, the framework achieves absolute accuracy gains of 19.34% and 25%, respectively, over conventional PoT baselines, demonstrating its effectiveness and strong scalability across multi-scale tabular data.

📝 Abstract
While large language models (LLMs) have shown promise in the table question answering (TQA) task through prompt engineering, they face challenges in industrial applications, including structural heterogeneity, difficulties in target data localization, and bottlenecks in complex reasoning. To address these limitations, this paper presents TableZoomer, a novel LLM-powered, programming-based agent framework. It introduces three key innovations: (1) replacing the original fully verbalized table with structured table schema to bridge the semantic gap and reduce computational complexity; (2) a query-aware table zooming mechanism that dynamically generates sub-table schema through column selection and entity linking, significantly improving target localization efficiency; and (3) a Program-of-Thoughts (PoT) strategy that transforms queries into executable code to mitigate numerical hallucination. Additionally, we integrate the reasoning workflow with the ReAct paradigm to enable iterative reasoning. Extensive experiments demonstrate that our framework maintains the usability advantages while substantially enhancing performance and scalability across tables of varying scales. When implemented with the Qwen3-8B-Instruct LLM, TableZoomer achieves accuracy improvements of 19.34% and 25% over conventional PoT methods on the large-scale DataBench dataset and the small-scale Fact Checking task of TableBench dataset, respectively.
Problem

Research questions and friction points this paper is trying to address.

Addresses structural heterogeneity in large-scale table question answering
Improves target data localization efficiency through dynamic zooming
Mitigates numerical hallucination via Program-of-Thoughts strategy
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured table schema replaces verbalized tables
Query-aware zooming mechanism for dynamic sub-tables
Program-of-Thoughts strategy converts queries to code
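The three innovations above can be sketched end to end. This is a minimal, hypothetical illustration, not the paper's implementation: `table_schema`, `zoom`, and the keyword-overlap column selector are assumptions standing in for the framework's LLM-driven column selection and entity linking, and the final lines mimic the kind of executable code a PoT step would emit instead of asking the model to do arithmetic in text.

```python
import pandas as pd

def table_schema(df: pd.DataFrame, n_samples: int = 3) -> str:
    """Structured schema in place of a fully verbalized table:
    column name, dtype, and a few sample values per column."""
    lines = []
    for col in df.columns:
        samples = df[col].dropna().unique()[:n_samples].tolist()
        lines.append(f"{col} ({df[col].dtype}): e.g. {samples}")
    return "\n".join(lines)

def zoom(df: pd.DataFrame, query: str) -> pd.DataFrame:
    """Query-aware zooming stub: keep columns whose names occur in the
    query. The paper uses LLM-based column selection and entity linking;
    this keyword overlap is only a stand-in for that mechanism."""
    keep = [c for c in df.columns if c.lower() in query.lower()]
    return df[keep] if keep else df

# PoT-style step: the generated program is executed, so the numeric
# answer comes from pandas rather than from token-level arithmetic.
df = pd.DataFrame({"city": ["Berlin", "Paris"],
                   "population": [3_645_000, 2_161_000]})
sub = zoom(df, "What is the total population across city entries?")
answer = int(sub["population"].sum())
```

A ReAct-style loop would wrap these calls: observe the schema, act by zooming or emitting code, observe the execution result, and iterate until the answer is grounded.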
Authors
Sishi Xiong, Ziyang He, Zhongjiang He, Yu Zhao, Changzai Pan, Jie Zhang, Zhenhe Wu, Shuangyong Song: Institute of Artificial Intelligence (TeleAI), China Telecom Corp Ltd, Beijing, China.
Yongxiang Li: Professor, RMIT University (Electronic Materials and Devices).