Efficient Table QA via TableGrid Navigation and Progressive Inference Prompting

📅 2026-05-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenges large language models face in table-based question answering: difficulty in precisely locating cells, difficulty in performing multi-step structured reasoning, and a lack of verifiable control over the reasoning process. The authors propose two training-free structured prompting frameworks: TableGrid Navigation (TGN), which employs a three-module iterative loop for row-column navigation and answer refinement, and Progressive Inference Prompting (PIP), which explicitly constrains column identification and row selection based on the query to enable step-by-step reasoning. Notably, these methods introduce the first interpretable and verifiable reasoning-path control mechanism, offering both plug-and-play usability and compatibility with supervised fine-tuning templates. On TableBench, TGN outperforms the strongest baseline by 3.8 points; on FeTaQA, PIP achieves state-of-the-art results, ahead of ReAct and Chain-of-Thought. Used as supervision templates for fine-tuning, both frameworks also improve smaller models and narrow their gap to much larger ones.

📝 Abstract

Large Language Models (LLMs) have shown promising results on NLP tasks. However, their performance on tabular data still needs research attention, because Table Question Answering (TQA) requires precise cell retrieval and multi-step structured reasoning. Existing work improves TQA by fine-tuning or training LLMs on task-specific tabular data, but often lacks verifiable control over how the model navigates tables and derives answers. In this work, we propose a training-free TQA approach with two structured prompting frameworks: TableGrid Navigation (TGN), which iteratively navigates rows and columns via a three-module loop to locate evidence and refine answers, and Progressive Inference Prompting (PIP), which enforces column identification followed by explicit, progressive row selection constrained by the query. We evaluate 17 LLMs against 6 baselines on the TableBench and FeTaQA datasets. On TableBench, TGN improves over the strongest baseline by 3.8 points, and on FeTaQA, PIP achieves SOTA performance over ReAct and Chain-of-Thought. Beyond inference-time gains, PIP and TGN can also serve as supervision templates to fine-tune small models, narrowing the performance gap to much larger architectures in resource-constrained settings and offering a versatile and cost-efficient solution for TQA.
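
To make the idea of progressive prompting concrete, the following is a minimal sketch of how a prompt in the spirit of PIP might be assembled: columns first, then row selection, then the answer. It is an illustration only, not the authors' template. The function names (`serialize_table`, `build_progressive_prompt`), the pipe-separated table format, and the exact step wording are all assumptions made for this example.

```python
from typing import List, Sequence


def serialize_table(header: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Render a table as pipe-separated text, one row per line (assumed format)."""
    lines = [" | ".join(header)]
    lines += [" | ".join(str(cell) for cell in row) for row in rows]
    return "\n".join(lines)


def build_progressive_prompt(
    header: Sequence[str],
    rows: Sequence[Sequence[object]],
    question: str,
) -> str:
    """Build a prompt that asks the model to reason in a fixed order.

    The three steps below are hypothetical wording; they only mirror the
    order described in the abstract (column identification, then
    progressive row selection, then the answer).
    """
    steps: List[str] = [
        "Step 1: List the columns needed to answer the question.",
        "Step 2: Using only those columns, select the matching rows, "
        "applying one condition from the question at a time.",
        "Step 3: Derive the answer from the selected cells and give it "
        "on a line starting with 'Answer:'.",
    ]
    sections = [
        "Table:\n" + serialize_table(header, rows),
        "Question: " + question,
        "\n".join(steps),
    ]
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Toy table with made-up values, used only to show the prompt layout.
    prompt = build_progressive_prompt(
        ["team", "season", "wins"],
        [["A", 2020, 10], ["B", 2020, 12], ["A", 2021, 14]],
        "How many wins did team A have in 2021?",
    )
    print(prompt)
```

Because each step is written out explicitly, the model's response can be checked step by step (were the right columns named, were the right rows kept), which is the kind of verifiable control the abstract describes. The resulting string would be passed to whichever LLM client is in use.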

Problem

Research questions and friction points this paper is trying to address.

Table Question Answering

Large Language Models

Structured Reasoning

Cell Retrieval

Tabular Data

Innovation

Methods, ideas, or system contributions that make the work stand out.

Table Question Answering

Prompt Engineering

Training-Free Method