Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning

📅 2025-06-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the weak generalization and poor robustness of supervised fine-tuning in tabular reasoning tasks—including table-based question answering, fact verification, and text-to-SQL—we propose the first reinforcement learning (RL)-based unified framework for tabular reasoning, introducing Proximal Policy Optimization (PPO) to this domain. Methodologically, we design a lightweight, rule-driven, structure-aware reward mechanism that integrates table structure preprocessing with multi-task prompt engineering, enabling joint training across tasks and emergent capability transfer. Evaluated on multiple benchmarks—including BIRD and WikiSQL—our approach achieves state-of-the-art performance: a 7B-parameter model attains 68.3% text-to-SQL accuracy on the BIRD dev set, outperforming Claude-3.7-Sonnet by 4.0% in overall performance. The framework significantly enhances model generalization, robustness to distributional shifts, and cross-task transferability.
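The "lightweight, rule-driven" outcome reward described above can be illustrated with a minimal sketch. All names here, and the `<answer>` tag format, are assumptions for illustration only, not the paper's actual implementation:

```python
# Hypothetical sketch of a rule-based outcome reward for table reasoning,
# in the spirit of the reward design summarized above. The '<answer>' tag
# convention and the penalty values are assumptions, not from the paper.

def extract_answer(completion: str):
    """Pull the final answer from a completion assumed to end with an
    '<answer>...</answer>' span; return None if the span is missing."""
    start = completion.rfind("<answer>")
    end = completion.rfind("</answer>")
    if start == -1 or end == -1 or end <= start:
        return None
    return completion[start + len("<answer>"):end].strip()

def outcome_reward(completion: str, gold: str) -> float:
    """Binary rule-based outcome reward: 1.0 for an exact (normalized)
    match with the gold answer, 0.0 for a wrong answer, and -1.0 when
    no parseable answer span is found (a format penalty)."""
    pred = extract_answer(completion)
    if pred is None:
        return -1.0  # malformed output: no answer span to score
    return 1.0 if pred.lower() == gold.strip().lower() else 0.0
```

Such a reward is computed per rollout and fed to the PPO objective; because it checks only the final outcome, it needs no learned reward model. For text-to-SQL, the same idea would compare execution results of the predicted and gold queries rather than strings.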

📝 Abstract
Table reasoning, encompassing tasks such as table question answering, fact verification, and text-to-SQL, requires precise understanding of structured tabular data, coupled with numerical computation and code manipulation for effective inference. Supervised fine-tuning (SFT) approaches have achieved notable success but often struggle with generalization and robustness due to biases inherent in imitative learning. We introduce Reasoning-Table, the first application of reinforcement learning (RL) to table reasoning, achieving state-of-the-art performance. Through rigorous data preprocessing, reward design, and tailored training strategies, our method leverages simple rule-based outcome rewards to outperform SFT across multiple benchmarks. Unified training across diverse tasks enables Reasoning-Table to emerge as a robust table reasoning large language model, surpassing larger proprietary models like Claude-3.7-Sonnet by 4.0% on table reasoning benchmarks. The approach also achieves excellent performance on text-to-SQL tasks, reaching 68.3% performance on the BIRD dev dataset with a 7B model. Further experiments demonstrate that Reasoning-Table enhances the model's generalization capabilities and robustness.
Problem

Research questions and friction points this paper is trying to address.

Applying reinforcement learning to improve table reasoning tasks
Overcoming generalization limits in supervised fine-tuning methods
Enhancing robustness and performance across diverse table-related benchmarks
Innovation

Methods, ideas, or system contributions that make the work stand out.

First RL application for table reasoning
Simple rule-based outcome rewards design
Unified training across diverse table tasks
Fangyu Lei
Institute of Automation, Chinese Academy of Sciences
LLM Agent, Code Generation, Text-to-SQL, Table Reasoning
Jinxiang Meng
Nanjing University of Posts and Telecommunications
LLM Agent, Table Reasoning, Tool Use
Yiming Huang
Tinghong Chen
University of Chinese Academy of Sciences
Yun Zhang
Institute of Automation, CAS
Shizhu He
Institute of Automation, CAS; University of Chinese Academy of Sciences
Jun Zhao
Institute of Automation, CAS
Kang Liu
Institute of Automation, CAS