TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models

📅 2026-02-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work proposes TabTracer, a framework that addresses key limitations of large language models in tabular reasoning: error propagation, weak backtracking capability, redundant reasoning trajectories, and excessive token consumption. TabTracer enables efficient and verifiable reasoning by explicitly tracking intermediate table states and integrating execution feedback through Monte Carlo Tree Search (MCTS). The framework introduces a state snapshot rollback mechanism, UCB1-based action selection, and a monotonicity-gated deduplication strategy. It further incorporates typed operation validation, lightweight numerical checks, state hashing, and budget-aware pruning to enhance reliability and efficiency. Evaluated on the TabFact, WikiTQ, and CRT benchmarks, TabTracer achieves up to a 6.7% absolute improvement in accuracy while reducing token usage by 59%–84% compared to existing approaches.
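To make the UCB1-based action selection mentioned above concrete, the following is a minimal sketch of the standard UCB1 rule applied to a search tree. It is not TabTracer's implementation: the `Node` class, the function names, and the exploration constant `c = sqrt(2)` are assumptions chosen for illustration, and the reward is treated as an opaque score in [0, 1].

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Search-tree node with visit statistics (hypothetical structure)."""
    name: str
    visits: int = 0
    total_reward: float = 0.0
    children: list = field(default_factory=list)


def ucb1(child, parent_visits, c=math.sqrt(2)):
    """Standard UCB1 score: mean reward plus an exploration bonus."""
    if child.visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    mean = child.total_reward / child.visits
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return mean + bonus


def select_child(parent):
    """Pick the child with the highest UCB1 score."""
    return max(parent.children, key=lambda ch: ucb1(ch, parent.visits))


def backpropagate(path, reward):
    """Add one visit and the observed reward to every node on the path."""
    for node in path:
        node.visits += 1
        node.total_reward += reward
```

In this sketch a rarely visited child can outrank a child with a higher mean reward, which is the exploration behavior that lets a tree search revisit alternatives instead of committing to a single chain.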

📝 Abstract

Large language models (LLMs) have emerged as powerful tools for natural language table reasoning, and existing methods fall into two main categories. Prompt-based approaches rely on language-only inference or one-pass program generation without step-level verification. Agent-based approaches use tools in a closed loop, but verification is often local and backtracking is limited, which allows errors to propagate and increases cost. Moreover, they rely on chain- or beam-style trajectories that are typically combinatorially redundant, leading to high token costs. In this paper, we propose TabTracer, an agentic framework that coordinates multi-step tool calls over intermediate table states, with explicit state tracking for verification and rollback. First, it enforces step-level verification with typed operations and lightweight numeric and format checks to provide reliable rewards and suppress hallucinations. Second, it runs execution-feedback Monte Carlo Tree Search (MCTS), which maintains a search tree of candidate table states and uses backpropagated reflection scores to guide UCB1 selection and rollback via versioned snapshots. Third, it reduces redundancy with budget-aware pruning, deduplication, and state hashing with a monotonicity gate to cut token cost. Comprehensive evaluation on the TabFact, WikiTQ, and CRT datasets shows that TabTracer outperforms state-of-the-art baselines by up to 6.7% in accuracy while reducing token consumption by 59%–84%.
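The combination of state hashing, deduplication, and rollback via versioned snapshots can be sketched as follows. This is only an illustration under stated assumptions: the `SnapshotStore` class, the `state_hash` function, and the JSON-plus-SHA-256 hashing scheme are hypothetical and not taken from the paper, and the monotonicity gate is omitted because the abstract does not define it.

```python
import hashlib
import json


def state_hash(header, rows):
    """Hash a table state via a canonical JSON encoding (assumed scheme)."""
    payload = json.dumps(
        {"header": list(header), "rows": [list(r) for r in rows]},
        sort_keys=True,
        default=str,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class SnapshotStore:
    """Keeps immutable table versions and detects duplicate states."""

    def __init__(self):
        self._versions = []  # version id -> (header, rows)
        self._seen = {}      # state hash -> version id

    def commit(self, header, rows):
        """Store a state and return (version_id, is_new)."""
        key = state_hash(header, rows)
        if key in self._seen:
            # Duplicate state: reuse the existing version, skip re-expansion.
            return self._seen[key], False
        version = len(self._versions)
        self._versions.append((tuple(header), tuple(tuple(r) for r in rows)))
        self._seen[key] = version
        return version, True

    def rollback(self, version):
        """Return a fresh mutable copy of a previously stored state."""
        header, rows = self._versions[version]
        return list(header), [list(r) for r in rows]
```

In this sketch, two different operation sequences that produce the same intermediate table map to the same version, so a search procedure could avoid spending tokens on expanding it twice, and any earlier version can be restored after a failed step.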

Problem

Research questions and friction points this paper is trying to address.

table reasoning

error propagation

backtracking

token cost

redundancy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Monte Carlo Tree Search

table reasoning

step-level verification