🤖 AI Summary
Industrial table question answering (TQA) faces three key challenges: structural heterogeneity, difficulty in localizing target data, and bottlenecks in complex reasoning. To address these, this paper proposes TableZoomer, a large language model (LLM)-based, programming-based agent framework. The method replaces the fully verbalized table with a structured table schema and introduces a query-aware table zooming mechanism that dynamically generates a sub-table schema through column selection and entity linking, improving the efficiency of target localization. A Program-of-Thoughts (PoT) strategy turns queries into executable code to mitigate numerical hallucination, and the workflow is integrated with the ReAct paradigm to enable iterative reasoning. Implemented with Qwen3-8B-Instruct, the framework improves accuracy over conventional PoT methods by 19.34% on the large-scale DataBench dataset and by 25% on the small-scale Fact Checking task of TableBench, indicating that it scales across tables of varying sizes.
📝 Abstract
While large language models (LLMs) have shown promise in the table question answering (TQA) task through prompt engineering, they face challenges in industrial applications, including structural heterogeneity, difficulties in target data localization, and bottlenecks in complex reasoning. To address these limitations, this paper presents TableZoomer, a novel LLM-powered, programming-based agent framework. It introduces three key innovations: (1) replacing the original fully verbalized table with a structured table schema to bridge the semantic gap and reduce computational complexity; (2) a query-aware table zooming mechanism that dynamically generates a sub-table schema through column selection and entity linking, significantly improving target localization efficiency; and (3) a Program-of-Thoughts (PoT) strategy that transforms queries into executable code to mitigate numerical hallucination. Additionally, we integrate the reasoning workflow with the ReAct paradigm to enable iterative reasoning. Extensive experiments demonstrate that our framework maintains usability while substantially enhancing performance and scalability across tables of varying scales. When implemented with the Qwen3-8B-Instruct LLM, TableZoomer achieves accuracy improvements of 19.34% and 25% over conventional PoT methods on the large-scale DataBench dataset and the small-scale Fact Checking task of the TableBench dataset, respectively.
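To make the three ideas more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a table held as a list of dictionaries, and every function name (`build_schema`, `zoom_schema`, `run_program`) is hypothetical. Naive keyword matching stands in for the LLM-driven column selection and entity linking, and a hard-coded string stands in for the code an LLM would generate in the PoT step. The ReAct loop is not modeled.

```python
def build_schema(rows, max_examples=3):
    """Summarize a table as column names, value types, and a few sample values.

    This compact description is what would be shown to the model instead of
    the fully verbalized table.
    """
    schema = []
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        examples = list(dict.fromkeys(values))[:max_examples]  # unique, order kept
        schema.append({
            "name": col,
            "dtype": type(values[0]).__name__,
            "examples": examples,
        })
    return schema


def zoom_schema(schema, rows, query):
    """Keep only columns that look relevant to the query.

    ASSUMPTION: substring matching replaces LLM-based column selection
    (column name appears in the query) and entity linking (a string cell
    value appears in the query).
    """
    q = query.lower()
    sub_schema = []
    for col in schema:
        name = col["name"]
        name_hit = name.lower() in q
        linked = sorted({
            r[name] for r in rows
            if isinstance(r[name], str) and r[name].lower() in q
        })
        if name_hit or linked:
            sub_schema.append({**col, "linked_entities": linked})
    return sub_schema


def run_program(code, rows):
    """Execute generated code that must assign its result to `answer`.

    ASSUMPTION: the code is trusted here; a real system would sandbox it.
    """
    scope = {"rows": rows}
    exec(code, scope)
    return scope["answer"]


rows = [
    {"city": "Paris", "year": 2020, "sales": 120.0},
    {"city": "Paris", "year": 2021, "sales": 150.0},
    {"city": "Berlin", "year": 2020, "sales": 90.0},
    {"city": "Berlin", "year": 2021, "sales": 110.0},
]
query = "What were the total sales in Paris?"

schema = build_schema(rows)
sub_schema = zoom_schema(schema, rows, query)

# Stand-in for the program an LLM would write after seeing sub_schema.
generated_code = 'answer = sum(r["sales"] for r in rows if r["city"] == "Paris")'
answer = run_program(generated_code, rows)
print([c["name"] for c in sub_schema], answer)
```

In this toy run the `year` column is dropped from the zoomed schema because neither its name nor its values appear in the query, and the arithmetic is done by executed code rather than by the model. In an iterative setup, an execution error or an implausible result could be fed back as an observation for another reasoning step.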