🤖 AI Summary
To address the poor interpretability and unverifiable reasoning of black-box models in tabular fact verification, this paper proposes the first explainable verification paradigm based on pandas query generation: natural-language claims are automatically compiled into executable, verifiable structured queries. Key contributions include: (1) constructing two high-quality, query-claim-aligned datasets, PanTabFact and PanWiki; (2) effectively distilling structured reasoning from a 671B model into a lightweight 7B model (a fine-tuned variant of DeepSeek-Coder-7B-Instruct); and (3) designing a DeepSeek-Chat-driven query synthesis framework with automated error correction. Experiments show state-of-the-art accuracy of 84.09% on TabFact and 84.72% in zero-shot transfer to WikiFact, significantly outperforming all baselines, along with 75.1% direct-answer accuracy on table question answering. The dataset is publicly released on Hugging Face.
📝 Abstract
Fact-checking tabular data is essential for ensuring the accuracy of structured information. However, existing methods often rely on black-box models with opaque reasoning. We introduce RePanda, a structured fact verification approach that translates claims into executable pandas queries, enabling interpretable and verifiable reasoning. To train RePanda, we construct PanTabFact, a structured dataset derived from the TabFact train set, where claims are paired with executable queries generated using DeepSeek-Chat and refined through automated error correction. Fine-tuning DeepSeek-coder-7B-instruct-v1.5 on PanTabFact, RePanda achieves 84.09% accuracy on the TabFact test set. To evaluate Out-of-Distribution (OOD) generalization, we interpret question-answer pairs from WikiTableQuestions as factual claims and refer to this dataset as WikiFact. Without additional fine-tuning, RePanda achieves 84.72% accuracy on WikiFact, significantly outperforming all other baselines and demonstrating strong OOD robustness. Notably, these results closely match the zero-shot performance of DeepSeek-Chat (671B), indicating that our fine-tuning approach effectively distills structured reasoning from a much larger model into a compact, locally executable 7B model. Beyond fact verification, RePanda extends to tabular question answering by generating executable queries that retrieve precise answers. To support this, we introduce PanWiki, a dataset mapping WikiTableQuestions to pandas queries. Fine-tuning on PanWiki, RePanda achieves 75.1% accuracy in direct answer retrieval. These results highlight the effectiveness of structured execution-based reasoning for tabular verification and question answering. We have publicly released the dataset on Hugging Face at datasets/AtoosaChegini/PanTabFact.
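To make the claim-to-query idea concrete, here is a minimal sketch of how a verification query and an answer-retrieval query might execute against a table. The table, claim, and query strings below are invented for illustration; in RePanda the queries are generated by the fine-tuned model, not hand-written.

```python
import pandas as pd

# Hypothetical example table (not from TabFact or WikiTableQuestions).
df = pd.DataFrame({
    "team": ["Arsenal", "Chelsea", "Liverpool"],
    "wins": [26, 21, 19],
})

# Fact verification: the claim "Arsenal won more than 25 games"
# compiles to a pandas expression that evaluates to a boolean verdict.
verify_query = "df[df['team'] == 'Arsenal']['wins'].iloc[0] > 25"
verdict = bool(eval(verify_query))
print(verdict)  # True -> claim is entailed by the table

# Question answering: "Which team has the most wins?" compiles to a
# query that retrieves the answer directly instead of a verdict.
answer_query = "df.loc[df['wins'].idxmax(), 'team']"
answer = eval(answer_query)
print(answer)  # Arsenal
```

Because each claim maps to an executable query, the verdict can be audited by inspecting and re-running the query, which is the interpretability advantage over black-box verifiers.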