🤖 AI Summary
To address the weak generalization and limited robustness of supervised fine-tuning (SFT) in table reasoning tasks (table-based question answering, fact verification, and text-to-SQL), we propose the first unified reinforcement learning (RL) framework for table reasoning, bringing Proximal Policy Optimization (PPO) to this domain. Methodologically, we design a lightweight, rule-driven, structure-aware reward mechanism that combines table-structure preprocessing with multi-task prompt engineering, enabling joint training across tasks and emergent capability transfer. Evaluated on multiple benchmarks, including BIRD and WikiSQL, our approach achieves state-of-the-art results: a 7B-parameter model attains 68.3% text-to-SQL accuracy on the BIRD dev set, and the unified model outperforms Claude-3.7-Sonnet by 4.0% overall. The framework significantly improves generalization, robustness to distributional shifts, and cross-task transferability.
📝 Abstract
Table reasoning, encompassing tasks such as table question answering, fact verification, and text-to-SQL, requires precise understanding of structured tabular data, coupled with numerical computation and code manipulation for effective inference. Supervised fine-tuning (SFT) approaches have achieved notable success but often struggle with generalization and robustness due to biases inherent in imitative learning. We introduce Reasoning-Table, the first application of reinforcement learning (RL) to table reasoning, achieving state-of-the-art performance. Through rigorous data preprocessing, reward design, and tailored training strategies, our method leverages simple rule-based outcome rewards to outperform SFT across multiple benchmarks. Unified training across diverse tasks enables Reasoning-Table to emerge as a robust table-reasoning large language model, surpassing larger proprietary models such as Claude-3.7-Sonnet by 4.0% on table reasoning benchmarks. The approach also performs strongly on text-to-SQL, reaching 68.3% accuracy on the BIRD dev set with a 7B model. Further experiments show that Reasoning-Table improves the model's generalization and robustness.
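To make the "simple rule-based outcome reward" concrete, the sketch below shows one plausible form such a reward could take: extract the model's final answer from a delimited span and score it 1.0 on a normalized exact match with the gold answer, else 0.0. This is a minimal illustration under our own assumptions (the `<answer>...</answer>` tag convention and the normalization rules are hypothetical), not the paper's exact implementation.

```python
# Hypothetical sketch of a rule-based outcome reward for table QA / fact
# verification; the tag convention and normalization are illustrative
# assumptions, not the paper's exact reward.
import re


def extract_answer(completion: str) -> str:
    """Pull the final answer from a delimited span, e.g. <answer>...</answer>."""
    match = re.search(r"<answer>(.*?)</answer>", completion, re.DOTALL)
    return match.group(1).strip() if match else ""


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't affect the score."""
    return re.sub(r"\s+", " ", text.strip().lower())


def outcome_reward(completion: str, gold: str) -> float:
    """Binary outcome reward: 1.0 on a normalized exact match, else 0.0."""
    pred = extract_answer(completion)
    if not pred:
        return 0.0  # malformed or missing answer span gets no reward
    return 1.0 if normalize(pred) == normalize(gold) else 0.0


print(outcome_reward("reasoning steps ... <answer>42 </answer>", "42"))  # 1.0
print(outcome_reward("no answer tags here", "42"))                        # 0.0
```

For text-to-SQL, the analogous outcome check would typically execute the predicted and gold queries against the database and compare result sets rather than matching strings, but the rule-based, binary structure of the reward stays the same.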