🤖 AI Summary
To address the poor interpretability and unverifiable reasoning of black-box models in tabular fact verification, this paper proposes the first explainable verification paradigm based on pandas query generation: natural-language claims are automatically compiled into executable, verifiable structured queries. Key contributions include: (1) constructing two high-quality, query-claim-aligned datasets, PanTabFact and PanWiki; (2) effectively distilling structured reasoning from a 671B model into a lightweight 7B model (a fine-tuned variant of DeepSeek-Coder-7B-Instruct); and (3) designing a DeepSeek-Chat-driven query synthesis framework with automated error correction. Experiments show state-of-the-art accuracy of 84.09% on TabFact and 84.72% in zero-shot transfer to WikiFact, significantly outperforming all baselines, along with 75.1% direct-answer accuracy on table question answering. The dataset is publicly released on Hugging Face.
📝 Abstract
Fact-checking tabular data is essential for ensuring the accuracy of structured information. However, existing methods often rely on black-box models with opaque reasoning. We introduce RePanda, a structured fact verification approach that translates claims into executable pandas queries, enabling interpretable and verifiable reasoning. To train RePanda, we construct PanTabFact, a structured dataset derived from the TabFact train set, where claims are paired with executable queries generated using DeepSeek-Chat and refined through automated error correction. Fine-tuning DeepSeek-coder-7B-instruct-v1.5 on PanTabFact, RePanda achieves 84.09% accuracy on the TabFact test set. To evaluate Out-of-Distribution (OOD) generalization, we interpret question-answer pairs from WikiTableQuestions as factual claims and refer to this dataset as WikiFact. Without additional fine-tuning, RePanda achieves 84.72% accuracy on WikiFact, significantly outperforming all other baselines and demonstrating strong OOD robustness. Notably, these results closely match the zero-shot performance of DeepSeek-Chat (671B), indicating that our fine-tuning approach effectively distills structured reasoning from a much larger model into a compact, locally executable 7B model. Beyond fact verification, RePanda extends to tabular question answering by generating executable queries that retrieve precise answers. To support this, we introduce PanWiki, a dataset mapping WikiTableQuestions to pandas queries. Fine-tuning on PanWiki, RePanda achieves 75.1% accuracy in direct answer retrieval. These results highlight the effectiveness of structured execution-based reasoning for tabular verification and question answering. We have publicly released the dataset on Hugging Face at datasets/AtoosaChegini/PanTabFact.
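To make the claim-to-query idea concrete, here is a minimal sketch of how a verification query and an answer-retrieval query might execute against a table. The table, claim, and query strings below are invented for illustration; in RePanda the queries are generated by the fine-tuned model, not hand-written.

```python
import pandas as pd

# Hypothetical example table (not from TabFact or WikiTableQuestions).
df = pd.DataFrame({
    "team": ["Arsenal", "Chelsea", "Liverpool"],
    "wins": [26, 21, 19],
})

# Fact verification: the claim "Arsenal won more than 25 games"
# compiles to a pandas expression that evaluates to a boolean verdict.
verify_query = "df[df['team'] == 'Arsenal']['wins'].iloc[0] > 25"
verdict = bool(eval(verify_query))
print(verdict)  # True -> claim is entailed by the table

# Question answering: "Which team has the most wins?" compiles to a
# query that retrieves the answer directly instead of a verdict.
answer_query = "df.loc[df['wins'].idxmax(), 'team']"
answer = eval(answer_query)
print(answer)  # Arsenal
```

Because each claim maps to an executable query, the verdict can be audited by inspecting and re-running the query, which is the interpretability advantage over black-box verifiers.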