Toward Real-World Table Agents: Capabilities, Workflows, and Design Principles for LLM-based Table Intelligence

📅 2025-07-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Real-world tabular tasks involve data noise, structural heterogeneity, and semantic complexity, issues underrepresented in existing studies that rely predominantly on clean academic benchmarks. This survey examines LLM-based Table Agents, which aim to automate end-to-end table workflows by integrating preprocessing, reasoning, and domain adaptation. It organizes current approaches around five core competencies: table structure understanding, table and query semantic understanding, table retrieval and compression, executable reasoning with traceability, and cross-domain generalization. A detailed examination of Text-to-SQL agents reveals a performance gap between academic benchmarks and real-world scenarios, especially for open-source models. The survey closes with actionable insights for improving the robustness, generalization, and efficiency of LLM-based Table Agents in practical settings.

📝 Abstract
Tables are fundamental in domains such as finance, healthcare, and public administration, yet real-world table tasks often involve noise, structural heterogeneity, and semantic complexity, issues that remain underexplored in existing research, which primarily targets clean academic datasets. This survey focuses on LLM-based Table Agents, which aim to automate table-centric workflows by integrating preprocessing, reasoning, and domain adaptation. We define five core competencies to analyze and compare current approaches: C1, Table Structure Understanding; C2, Table and Query Semantic Understanding; C3, Table Retrieval and Compression; C4, Executable Reasoning with Traceability; and C5, Cross-Domain Generalization. In addition, a detailed examination of the Text-to-SQL Agent reveals a performance gap between academic benchmarks and real-world scenarios, especially for open-source models. Finally, we provide actionable insights to improve the robustness, generalization, and efficiency of LLM-based Table Agents in practical settings.
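
As a rough illustration of what C4 (Executable Reasoning with Traceability) could look like in a Text-to-SQL setting, the sketch below runs a query against an in-memory SQLite table and records each executed statement together with its result, so an answer can be traced back to the query that produced it. It is a minimal sketch under stated assumptions, not the survey's method: the `loans` table, the `TracedSession` and `TraceStep` names, and the hard-coded `generated_sql` string (standing in for model output) are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class TraceStep:
    """One executed statement and what it returned (hypothetical structure)."""
    sql: str
    columns: list
    rows: list


@dataclass
class TracedSession:
    """Wraps a SQLite connection and logs every query it runs."""
    conn: sqlite3.Connection
    trace: list = field(default_factory=list)

    def run(self, sql):
        cur = self.conn.execute(sql)
        # cursor.description is None for statements that return no rows.
        columns = [d[0] for d in cur.description] if cur.description else []
        rows = cur.fetchall()
        self.trace.append(TraceStep(sql, columns, rows))
        return rows


def build_demo_session():
    """Create a tiny illustrative table; the schema is an assumption."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE loans (branch TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO loans VALUES (?, ?)",
        [("east", 100.0), ("east", 250.0), ("west", 75.0)],
    )
    return TracedSession(conn)


session = build_demo_session()

# Stand-in for SQL that an LLM would generate from a natural-language question.
generated_sql = (
    "SELECT branch, SUM(amount) AS total "
    "FROM loans GROUP BY branch ORDER BY branch"
)
result = session.run(generated_sql)

# Each trace entry links a result back to the exact statement that produced it.
for step in session.trace:
    print(step.sql, step.columns, step.rows)
```

Keeping the trace as plain data, rather than free-text explanation, is one way to make each reasoning step re-executable and auditable; a real agent would also need to validate or sandbox model-generated SQL before running it.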
Problem

Research questions and friction points this paper is trying to address.

Addressing noise and complexity in real-world table tasks
Automating table workflows with preprocessing and reasoning
Bridging performance gaps in Text-to-SQL for real-world use
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-based Table Agents automate workflows
Five core competencies for table tasks
Improving robustness in real-world scenarios
Authors
Jiaming Tian
College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China
Liyao Li
PhD Candidate, Zhejiang University
Table Reasoning, Large Tabular Language Model, Machine Learning
Wentao Ye
Zhejiang University, Ant Research
LLMs, Machine Learning, Multimodality
Haobo Wang
Zhejiang University
Machine Learning
Lingxin Wang
College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China
Lihua Yu
Bank of Hangzhou, Hangzhou 310016, China
Zujie Ren
Hangzhou Dianzi University
Big Data, Workload, Measurement, Benchmarking
Gang Chen
College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China
Junbo Zhao
College of Computer Science and Technology, Zhejiang University, Hangzhou 310007, China