🤖 AI Summary
Existing table reasoning methods are constrained by fixed logical templates and closed operation sets, which limits their ability to emulate the flexible cognitive processes of human analysts. To address this, we propose PoTable, a novel framework that introduces a human-inspired, multi-stage logical decomposition mechanism. PoTable constructs an open, unconstrained operation pool, enabling LLM-driven multi-step planning coordinated with real-time Python execution. It achieves high interpretability and strong executability through dynamic symbolic tool invocation, standardized generation of commented code, and an execution-feedback closed loop. Evaluated on three datasets from two public benchmarks, GPT-based PoTable improves absolute accuracy by more than 4 percentage points over the strongest baseline.
📝 Abstract
Table-based reasoning has garnered substantial research interest, particularly in its integration with Large Language Models (LLMs), which have revolutionized the general reasoning paradigm. Numerous LLM-based studies introduce symbolic tools (e.g., databases, Python) as assistants to extend human-like abilities in structured table understanding and complex arithmetic computations. However, these studies could better simulate human cognitive behavior when using symbolic tools, as they still suffer from non-standard logical splits and constrained operation pools. In this study, we propose PoTable, a novel table-based reasoning method that simulates a human tabular analyst by integrating a Python interpreter as the real-time executor with an LLM-based operation planner and code generator. Specifically, PoTable follows a human-like logical stage split and extends the operation pool into an open-world space without any constraints. By planning and executing in each distinct stage, PoTable completes the entire reasoning process in a standardized way and produces superior reasoning results along with highly accurate, step-by-step commented, and fully executable programs. This demonstrates both the effectiveness and the explainability of PoTable. Extensive experiments over three evaluation datasets from two public benchmarks on two backbones show the outstanding performance of our approach. In particular, GPT-based PoTable achieves over 4% higher absolute accuracy than the runners-up on all evaluation datasets.
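The plan-then-execute loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The stage names, the `stub_plan` and `stub_generate` functions (which stand in for the LLM-based planner and code generator), the retry limit, and the toy table are all assumptions made for illustration. The sketch only shows the general shape: a fixed sequence of stages, free-form operations planned per stage, commented code generated per operation, and a Python interpreter that runs each snippet immediately and returns any error to the generator.

```python
import traceback

# Hypothetical stage names; the abstract does not enumerate the actual stages.
STAGES = ["load", "clean", "answer"]


def stub_plan(stage, question):
    """Stand-in for the LLM planner: free-form operation descriptions per stage."""
    plans = {
        "load": ["build row dicts from the raw table"],
        "clean": ["convert the points column to integers"],
        "answer": ["find the team with the most points"],
    }
    return plans[stage]


def stub_generate(operation, error=None):
    """Stand-in for the LLM code generator: commented Python for one operation.

    When `error` is given (feedback from a failed run), a revised snippet is returned.
    """
    if operation == "build row dicts from the raw table":
        return "# Pair each row with the header\nrows = [dict(zip(header, r)) for r in raw_rows]"
    if operation == "convert the points column to integers":
        return "# Cast points from str to int\nfor row in rows:\n    row['points'] = int(row['points'])"
    if error is None:
        # First attempt deliberately uses a wrong column name to trigger feedback.
        return "# Pick the row with the highest points\nanswer = max(rows, key=lambda r: r['point'])['team']"
    return "# Pick the row with the highest points\nanswer = max(rows, key=lambda r: r['points'])['team']"


def run_stages(header, raw_rows, question, max_retries=2):
    """Plan and execute stage by stage, sharing one interpreter namespace."""
    env = {"header": header, "raw_rows": raw_rows}
    program = []
    for stage in STAGES:
        for operation in stub_plan(stage, question):
            error = None
            for _ in range(max_retries + 1):
                code = stub_generate(operation, error)
                try:
                    exec(code, env)  # real-time execution of the generated snippet
                    program.append(code)
                    break
                except Exception:
                    error = traceback.format_exc()  # fed back to the generator
            else:
                raise RuntimeError(f"operation failed after retries: {operation}")
    return env.get("answer"), "\n".join(program)


if __name__ == "__main__":
    table_header = ["team", "points"]
    table_rows = [["A", "10"], ["B", "25"], ["C", "7"]]
    result, final_program = run_stages(table_header, table_rows, "Which team has the most points?")
    print(result)
    print(final_program)
```

Only snippets that ran successfully are appended to the returned program, so in this sketch the final program is both commented and executable end to end. Replacing the two stubs with real LLM calls would leave the control flow unchanged.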