Enhancing TableQA through Verifiable Reasoning Trace Reward

📅 2026-01-30

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses complex reasoning in TableQA, where answers must be derived through multi-step transformations of the table state, which complicates environment interaction and hurts inference efficiency. The authors propose RE-Tab, a framework that formulates TableQA as a Partially Observable Markov Decision Process. RE-Tab introduces a lightweight, training-free trajectory reward mechanism that provides verifiable reward signals during State Transition and Simulative Reasoning, guiding large language models toward efficient and trustworthy step-by-step inference. The approach is plug-and-play and compatible with various large language models. Reported results show state-of-the-art performance across multiple benchmarks, with up to a 41.77% improvement in question-answering accuracy, a 33.33% reduction in the test-time inference samples needed for a consistent answer, and an almost 25% decrease in inference cost.

📝 Abstract

A major challenge in training TableQA agents, compared to standard text- and image-based agents, is that answers cannot be inferred from a static input but must be reasoned through stepwise transformations of the table state, introducing multi-step reasoning complexity and environmental interaction. This leads to a research question: can explicit feedback on table transformation actions improve a model's reasoning capability? In this work, we introduce RE-Tab, a plug-and-play framework that architecturally enhances trajectory search via lightweight, training-free reward modeling by formulating the problem as a Partially Observable Markov Decision Process. We demonstrate that providing explicit verifiable rewards during State Transition ("What is the best action?") and Simulative Reasoning ("Am I sure about the output?") is crucial to steering the agent's navigation through table states. By enforcing stepwise reasoning with reward feedback in table transformations, RE-Tab achieves state-of-the-art performance in TableQA with an almost 25% drop in inference cost. Furthermore, a direct plug-and-play implementation of RE-Tab brings up to a 41.77% improvement in QA accuracy and a 33.33% drop in the test-time inference samples needed for a consistent answer. A consistent improvement pattern across various LLMs and state-of-the-art benchmarks further confirms RE-Tab's generalisability. The repository is available at https://github.com/ThomasK1018/RE_Tab .
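
To make the two reward points in the abstract concrete, here is a minimal sketch of reward-guided action selection over table states. It is not the RE-Tab implementation: the function names (`transition_reward`, `best_action`, `confidence`), the reward formula, and the toy table are all assumptions made for illustration. The State Transition step scores each candidate transformation with a simple checkable reward, and the Simulative Reasoning step is approximated by the agreement ratio among sampled answers.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Tuple

Table = List[Dict[str, object]]
Action = Tuple[str, Callable[[Table], Table]]


def transition_reward(table: Table, required_columns: List[str]) -> float:
    """Hypothetical verifiable reward for a candidate next table state.

    Returns 0.0 for an empty table; otherwise the fraction of required
    columns still present, plus a small bonus for having fewer rows.
    """
    if not table:
        return 0.0
    kept = sum(1 for col in required_columns if col in table[0])
    coverage = kept / len(required_columns) if required_columns else 1.0
    return coverage + 1.0 / (1 + len(table))


def best_action(
    table: Table, actions: List[Action], required_columns: List[str]
) -> Optional[Tuple[str, Table, float]]:
    """State Transition ("What is the best action?"): try every candidate
    action and keep the one whose resulting table earns the highest reward."""
    scored = []
    for name, fn in actions:
        try:
            next_table = fn(table)
        except (KeyError, TypeError):
            continue  # an action that fails to execute is discarded
        reward = transition_reward(next_table, required_columns)
        scored.append((reward, name, next_table))
    if not scored:
        return None
    reward, name, next_table = max(scored, key=lambda item: item[0])
    return name, next_table, reward


def confidence(sampled_answers: List[str]) -> Tuple[str, float]:
    """Simulative Reasoning ("Am I sure about the output?"): return the
    majority answer and the share of samples that agree with it."""
    answer, count = Counter(sampled_answers).most_common(1)[0]
    return answer, count / len(sampled_answers)


# Toy example (assumed data): which action helps answer a question
# about Oslo's sales?
table = [
    {"city": "Oslo", "year": 2020, "sales": 10},
    {"city": "Oslo", "year": 2021, "sales": 14},
    {"city": "Rome", "year": 2021, "sales": 9},
]
actions = [
    ("filter_city_oslo", lambda t: [r for r in t if r["city"] == "Oslo"]),
    ("drop_sales", lambda t: [{k: v for k, v in r.items() if k != "sales"} for r in t]),
    ("filter_missing_col", lambda t: [r for r in t if r["region"] == "N"]),
]
name, next_table, reward = best_action(table, actions, ["city", "sales"])
majority, agreement = confidence(["24", "24", "23"])
```

In this toy run the filter on `city` wins because it keeps both required columns and shrinks the table, the action that drops `sales` scores lower, and the action that references a missing column is discarded when it fails to execute. An agent could stop sampling once `agreement` passes a chosen threshold, which is one plausible way fewer test-time samples might be needed; the actual stopping rule used by RE-Tab is not described in the abstract.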

Problem

Research questions and friction points this paper is trying to address.

TableQA

multi-step reasoning

table transformation

reasoning complexity

environmental interaction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Verifiable Reasoning

Reward Modeling

TableQA