Mixture-of-Minds: Multi-Agent Reinforcement Learning for Table Understanding

📅 2025-10-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing table understanding methods face a dichotomy: fine-tuning approaches strengthen language reasoning but are prone to arithmetic errors and hallucination, while tool-calling approaches enable precise table manipulation but rely on rigid schemas and lack semantic comprehension. To address this, the authors propose Mixture-of-Minds, a multi-agent framework that decomposes table reasoning into three specialized roles (planning, coding, and answering), coupling semantic understanding with precise execution through a code-execution feedback loop. Building on this workflow, they introduce a self-improvement training scheme that uses Monte Carlo Tree Search (MCTS) rollouts to generate pseudo-gold trajectories, which are then used to optimize the agents with reinforcement learning. On the TableBench benchmark, the method reaches 62.13% accuracy, surpassing OpenAI-o4-mini-high, and improves both reasoning robustness and execution reliability.

📝 Abstract
Understanding and reasoning over tables is a critical capability for many real-world applications. Large language models (LLMs) have shown promise on this task, but current approaches remain limited. Fine-tuning-based methods strengthen language reasoning, yet they are prone to arithmetic errors and hallucination. In contrast, tool-based methods enable precise table manipulation but rely on rigid schemas and lack semantic understanding. These complementary drawbacks highlight the need for approaches that integrate robust reasoning with reliable table processing. In this work, we propose Mixture-of-Minds, a multi-agent framework that decomposes table reasoning into three specialized roles: planning, coding, and answering. This design enables each agent to focus on a specific aspect of the task while leveraging code execution for precise table manipulation. Building on this workflow, we introduce a self-improvement training framework that employs Monte Carlo Tree Search (MCTS) rollouts to generate pseudo-gold trajectories and optimize agents with reinforcement learning (RL). Extensive experiments show that Mixture-of-Minds delivers substantial gains, reaching 62.13% on TableBench and surpassing OpenAI-o4-mini-high. These results demonstrate the promise of combining structured multi-agent workflows with RL to advance table understanding.
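The three-role workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration with hypothetical interfaces: `planner`, `coder`, `answerer`, and `executor` are placeholder callables, not the paper's actual components or prompts.

```python
def run_table_pipeline(table, question, planner, coder, answerer, executor):
    """Planning -> coding -> answering, with a code-execution feedback loop.

    Hypothetical sketch: each role would be an LLM call in practice, and the
    executor a sandboxed runtime (e.g. Python/pandas) applied to the table.
    """
    plan = planner(table, question)          # natural-language sub-steps
    code = coder(table, question, plan)      # program operating on the table
    result = executor(code)                  # run the code against the table
    if result.get("error"):                  # feed execution errors back
        code = coder(table, question, plan, feedback=result["error"])
        result = executor(code)
    return answerer(table, question, plan, result)
```

The feedback branch is the point of the design: coding mistakes surface as execution errors and can be repaired before the answering role ever sees them, rather than leaking into the final answer.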
Problem

Research questions and friction points this paper is trying to address.

Addresses the limitations of current LLM-based table understanding approaches
Integrates robust language reasoning with reliable, precise table processing
Overcomes the arithmetic errors of fine-tuned models and the schema rigidity of tool-based methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-agent framework with specialized planning, coding, answering roles
Self-improvement training using MCTS rollouts and reinforcement learning
Combining structured multi-agent workflows with code execution
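The self-improvement idea, keeping strong rollout trajectories as pseudo-gold supervision for RL, can be illustrated with a best-of-N simplification. Note this is a deliberate stand-in: full MCTS additionally maintains a search tree with selection/expansion/backup, and the `rollout` and `score` functions below are assumed toy interfaces, not the paper's implementation.

```python
import random

def select_pseudo_gold(rollout, score, n=32, seed=0):
    """Best-of-N stand-in for MCTS trajectory search.

    Samples n trajectories, scores each (e.g. by final-answer correctness),
    and keeps the best one as a pseudo-gold trajectory for RL training.
    """
    rng = random.Random(seed)
    best_traj, best_score = None, float("-inf")
    for _ in range(n):
        traj = rollout(rng)        # one planning/coding/answering trace
        s = score(traj)            # reward signal for the whole trace
        if s > best_score:
            best_traj, best_score = traj, s
    return best_traj, best_score
```

The selected trajectories would then serve as training targets for the individual agents, closing the loop between search-time exploration and learned policies.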