Reasoning by Commented Code for Table Question Answering

📅 2026-01-31
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge that large language models often disrupt the two-dimensional structure of tables during linearization, leading to inaccurate numerical reasoning and limited interpretability. To mitigate this, the authors propose a multi-step code generation framework that, for the first time, incorporates explicit natural language annotations during program synthesis. By decomposing table-based question answering into annotated, executable Python programs, the approach enhances the model’s understanding of tabular structure, improves numerical accuracy, and increases reasoning transparency. Built upon Qwen2.5-Coder-7B-Instruct and integrating multi-line annotated program synthesis with a lightweight answer selection mechanism, the method achieves 70.9% accuracy on WikiTableQuestions. When further combined with an end-to-end model, performance improves to 84.3%, substantially outperforming the Repanda baseline (67.6%).

πŸ“ Abstract
Table Question Answering (TableQA) poses a significant challenge for large language models (LLMs) because conventional linearization of tables often disrupts the two-dimensional relationships intrinsic to structured data. Existing methods, which depend on end-to-end answer generation or single-line program queries, typically exhibit limited numerical accuracy and reduced interpretability. This work introduces a commented, step-by-step code-generation framework that incorporates explicit reasoning into the Python program-generation process. The approach decomposes TableQA reasoning into multi-line executable programs with concise natural language comments, thereby promoting clearer reasoning and increasing the likelihood of generating correct code. On the WikiTableQuestions benchmark, the proposed method achieves 70.9\% accuracy using Qwen2.5-Coder-7B-Instruct, surpassing the Repanda baseline (67.6\%). Integrating the proposed framework with a robust end-to-end TableQA model via a lightweight answer-selection mechanism yields further improvements. This combined approach achieves up to 84.3\% accuracy on the WikiTableQuestions benchmark.
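To make the idea concrete, the following is a minimal illustrative sketch of what a multi-line, comment-annotated program for a table question might look like. The table, question, and helper function are hypothetical examples, not taken from the paper or the WikiTableQuestions benchmark; the paper's actual programs are generated by Qwen2.5-Coder-7B-Instruct from a prompt.

```python
import pandas as pd

# Hypothetical table for illustration (not from the paper's benchmark).
table = pd.DataFrame({
    "Country": ["USA", "China", "Japan"],
    "Gold": [39, 38, 27],
    "Silver": [41, 32, 14],
})

def answer_question(df: pd.DataFrame) -> str:
    """Question: 'Which country won the most gold medals?'"""
    # Step 1: the question concerns gold medals, so select that column.
    gold = df["Gold"]
    # Step 2: locate the row index with the maximum gold count.
    idx = gold.idxmax()
    # Step 3: read off the country name in that row as the final answer.
    return df.loc[idx, "Country"]

print(answer_question(table))  # -> USA
```

Each step carries a short natural-language comment, mirroring the paper's claim that interleaving annotations with executable lines keeps the reasoning transparent and makes the generated code more likely to be correct.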
Problem

Research questions and friction points this paper is trying to address.

Table Question Answering
Large Language Models
Structured Data Reasoning
Numerical Accuracy
Interpretability
Innovation

Methods, ideas, or system contributions that make the work stand out.

commented code generation
table question answering
step-by-step reasoning
executable program synthesis
interpretable LLM
Seho Pyo
Department of Data Science, Seoul National University, Seoul, Republic of Korea
Jiheon Seok
Department of Data Science, Seoul National University, Seoul, Republic of Korea
Jaejin Lee
Dept. of Computer Science and Engineering, Seoul National University
Parallel processing · Compilers · Computer architectures · Operating systems · Heterogeneous computing