TabSD: Large Free-Form Table Question Answering with SQL-Based Table Decomposition

📅 2025-02-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the limited understanding and reasoning capabilities of large language models (LLMs) on large, schema-less, noisy free-form tables, this paper proposes TabSD, an SQL-driven table decomposition framework. It uses generated SQL queries to guide sub-table extraction, adds an SQL Verifier that refines those queries, and combines LLM-based semantic comprehension with programmatic structural constraints. The paper also introduces SLQA and SEQA, two TableQA datasets that consist solely of large free-form tables. Experiments on four benchmark datasets show accuracy improvements of 23.07%, 2.84%, 23.24%, and 9.32% over the strongest baselines, which the authors attribute to better handling of large and noisy free-form tables.

📝 Abstract
Question answering on free-form tables (TableQA) is challenging due to the absence of predefined schemas and the presence of noise in large tables. While Large Language Models (LLMs) have shown promise in TableQA, they struggle with large free-form tables and are sensitive to noise. To address these challenges, we propose TabSD, a SQL-based decomposition model that enhances LLMs' ability to process large free-form tables. TabSD generates SQL queries to guide the table decomposition, remove noise, and process sub-tables for better answer generation. Additionally, a SQL Verifier refines SQL outputs to enhance decomposition accuracy. We introduce two TableQA datasets, SLQA and SEQA, which consist solely of large free-form tables and will be publicly available. Experimental results on four benchmark datasets demonstrate that TabSD outperforms the best existing baseline models by 23.07%, 2.84%, 23.24% and 9.32% in accuracy, respectively, highlighting its effectiveness in handling large and noisy free-form tables.
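The idea of using an SQL query to cut a large table down to a small, relevant sub-table can be illustrated with a minimal sketch. It is not the TabSD implementation: the `decompose` helper, the table name `t`, the toy data, and the example query are all assumptions made for illustration, and in the paper's setting the query would come from an LLM rather than being written by hand.

```python
import sqlite3


def decompose(rows, columns, sql):
    """Load a table into in-memory SQLite and return the sub-table selected by `sql`.

    Hypothetical helper for illustration only; every column is stored as TEXT.
    """
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f"CREATE TABLE t ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", rows)
    cur = conn.execute(sql)
    header = [d[0] for d in cur.description]  # column names of the sub-table
    sub_rows = cur.fetchall()
    conn.close()
    return header, sub_rows


# Toy table with an irrelevant "notes" column standing in for noise (assumed data).
columns = ["team", "year", "wins", "notes"]
rows = [
    ("Lions", "2019", "10", "n/a"),
    ("Tigers", "2019", "7", "see footnote"),
    ("Lions", "2020", "12", ""),
    ("Bears", "2020", "5", "n/a"),
]

# Hypothetical query for the question "How many wins did the Lions have in 2020?"
sql = "SELECT team, year, wins FROM t WHERE team = 'Lions' AND year = '2020'"
header, sub_table = decompose(rows, columns, sql)
```

The resulting sub-table keeps only the columns and rows the query selects, so a downstream answer generator would see far less irrelevant content than in the full table.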
Problem

Research questions and friction points this paper is trying to address.

Enhances LLMs' table processing
Reduces noise in free-form tables
Improves accuracy in TableQA
Innovation

Methods, ideas, or system contributions that make the work stand out.

SQL-based table decomposition model
SQL Verifier enhances accuracy
Handles large noisy free-form tables
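The summary does not say how the SQL Verifier works internally, so the following is only a rough stand-in for one kind of check such a component could perform: testing whether a generated query compiles against the table's columns before it is used for decomposition. The `check_sql` function and the table name `t` are hypothetical, and a compile check of this kind is much weaker than refining a query.

```python
import sqlite3


def check_sql(sql, columns):
    """Return (True, None) if `sql` compiles against an empty table `t`
    with the given columns, otherwise (False, error message).

    Hypothetical check for illustration; not the paper's SQL Verifier.
    """
    conn = sqlite3.connect(":memory:")
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    conn.execute(f"CREATE TABLE t ({col_defs})")
    try:
        # EXPLAIN makes SQLite compile the statement without running the query itself.
        conn.execute(f"EXPLAIN {sql}")
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


columns = ["team", "year", "wins"]
ok_good, err_good = check_sql("SELECT team, wins FROM t WHERE year = '2020'", columns)
ok_bad, err_bad = check_sql("SELECT score FROM t", columns)  # "score" is not a column
```

A failed check returns the database error message, which could be fed back to whatever generated the query as a hint for a retry.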