🤖 AI Summary
Existing table understanding methods face a trade-off: fine-tuning strengthens language reasoning but is prone to arithmetic errors and hallucinations, while tool-calling approaches manipulate tables precisely but depend on rigid schemas and lack semantic comprehension. To bridge this gap, Mixture-of-Minds is a multi-agent framework that decomposes table reasoning into three specialized roles (planning, coding, and answering), coupling semantic understanding with precise table manipulation through a code-execution feedback loop. On top of this workflow, a self-improvement training scheme uses Monte Carlo Tree Search (MCTS) rollouts to generate pseudo-gold trajectories and optimizes the agents with reinforcement learning. On the TableBench benchmark, the method reaches 62.13% accuracy, surpassing OpenAI-o4-mini-high, with gains in both reasoning robustness and execution reliability.
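To make the role decomposition concrete, here is a minimal runnable sketch of a planner/coder/answerer loop with code-execution feedback. The three roles come from the summary; the stub agents, the `run_workflow` function, and the retry policy are illustrative assumptions, not the paper's implementation (a real system would back each role with an LLM).

```python
def planner(question, table):
    # Planning role: break the question into a concrete table operation.
    # (Stubbed: a real planner would be an LLM call.)
    return {"op": "sum", "column": "sales"}

def coder(plan, table):
    # Coding role: emit executable code that carries out the plan.
    return f"result = sum(row['{plan['column']}'] for row in table)"

def answerer(question, exec_result):
    # Answering role: turn the execution result into a natural-language answer.
    return f"The total is {exec_result}."

def run_workflow(question, table, max_retries=2):
    plan = planner(question, table)
    for _ in range(max_retries + 1):
        code = coder(plan, table)
        env = {"table": table}
        try:
            exec(code, env)  # code-execution feedback: run the coder's output
            return answerer(question, env["result"])
        except Exception:
            continue  # on failure, retry (a real system would revise the plan)
    return "unable to answer"

table = [{"sales": 10}, {"sales": 32}]
print(run_workflow("What are total sales?", table))  # → The total is 42.
```

The point of the loop is that execution errors feed back into another attempt rather than surfacing as a hallucinated number, which is the failure mode the summary attributes to pure fine-tuning.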
📝 Abstract
Understanding and reasoning over tables is a critical capability for many real-world applications. Large language models (LLMs) have shown promise on this task, but current approaches remain limited. Fine-tuning-based methods strengthen language reasoning, yet they are prone to arithmetic errors and hallucinations. In contrast, tool-based methods enable precise table manipulation but rely on rigid schemas and lack semantic understanding. These complementary drawbacks highlight the need for approaches that integrate robust reasoning with reliable table processing. In this work, we propose Mixture-of-Minds, a multi-agent framework that decomposes table reasoning into three specialized roles: planning, coding, and answering. This design enables each agent to focus on a specific aspect of the task while leveraging code execution for precise table manipulation. Building on this workflow, we introduce a self-improvement training framework that employs Monte Carlo Tree Search (MCTS) rollouts to generate pseudo-gold trajectories and optimize agents with reinforcement learning (RL). Extensive experiments show that Mixture-of-Minds delivers substantial gains, reaching 62.13% on TableBench and surpassing OpenAI-o4-mini-high. These results demonstrate the promise of combining structured multi-agent workflows with RL to advance table understanding.
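The pseudo-gold idea in the training framework can be sketched as follows. This is a deliberately simplified stand-in for the paper's MCTS: instead of a full search tree, it samples complete trajectories, scores each with a reward, and keeps the best-scoring one as the pseudo-gold target for RL. Every name and the toy reward here are assumptions for illustration.

```python
import random

def sample_trajectory(rng):
    # One rollout = a full plan -> code -> answer path.
    # (Stubbed: a real system would sample each step from the agents.)
    return [rng.choice(["plan_a", "plan_b"]),
            rng.choice(["code_x", "code_y"]),
            "answer"]

def reward(trajectory):
    # Toy reward: 1.0 iff the trajectory follows the known-good path
    # (in practice, reward would come from answer correctness).
    return 1.0 if trajectory[:2] == ["plan_a", "code_x"] else 0.0

def pseudo_gold(n_rollouts=64, seed=0):
    # Sample many rollouts and keep the best-scoring one as the
    # pseudo-gold trajectory to train the agents on.
    rng = random.Random(seed)
    rollouts = [sample_trajectory(rng) for _ in range(n_rollouts)]
    return max(rollouts, key=reward)

print(pseudo_gold())
```

The selected trajectory then serves as a supervision signal, so the agents can improve without hand-labeled intermediate steps, which is what makes the scheme self-improving.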