Plugging Schema Graph into Multi-Table QA: A Human-Guided Framework for Reducing LLM Reliance

📅 2025-06-04
📈 Citations: 0
Influential: 0

🤖 AI Summary
Problem: Schema linking in multi-table question answering (QA) is unreliable on complex real-world tables because their schemas are ambiguous or heterogeneous. Method: The paper proposes a human-validated schema graph approach: (1) constructing a structured schema graph grounded in domain knowledge; (2) traversing the graph under the guidance of the natural language query to generate interpretable reasoning chains; and (3) applying sub-path merging and pruning strategies to improve the efficiency and logical coherence of cross-table reasoning. Contribution/Results: According to the authors, this is the first work to deploy human-guided schema graphs in industrial-scale multi-table QA, reducing reliance on large language models (LLMs). Experiments on standard benchmarks and a large-scale real-world industrial dataset are reported to show that the method outperforms state-of-the-art approaches and remains effective on complex, heterogeneous tabular data with diverse column semantics.

📝 Abstract
Large language models (LLMs) have shown promise in table Question Answering (Table QA). However, extending these capabilities to multi-table QA remains challenging due to unreliable schema linking across complex tables. Existing methods based on semantic similarity work well only on simplified hand-crafted datasets and struggle to handle complex, real-world scenarios with numerous and diverse columns. To address this, we propose a graph-based framework that leverages human-curated relational knowledge to explicitly encode schema links and join paths. Given a natural language query, our method searches this graph to construct interpretable reasoning chains, aided by pruning and sub-path merging strategies to enhance efficiency and coherence. Experiments on both standard benchmarks and a realistic, large-scale dataset demonstrate the effectiveness of our approach. To our knowledge, this is the first multi-table QA system applied to truly complex industrial tabular data.
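
To make the idea of a graph that explicitly encodes schema links and join paths more concrete, here is a minimal Python sketch. It is not the authors' implementation: the table names, the join conditions, and the choice of breadth-first search are all assumptions made for illustration. Tables are nodes, each human-validated join is an edge, and a search between two tables returns a chain of joins that can be read as a reasoning chain.

```python
from collections import deque

# Hypothetical human-curated schema graph. Every table name and join
# condition below is invented for illustration and is not from the paper.
SCHEMA_EDGES = [
    ("orders", "customers", "orders.customer_id = customers.id"),
    ("orders", "order_items", "orders.id = order_items.order_id"),
    ("order_items", "products", "order_items.product_id = products.id"),
    ("products", "suppliers", "products.supplier_id = suppliers.id"),
]


def build_adjacency(edges):
    """Turn the edge list into an undirected adjacency map."""
    adjacency = {}
    for left, right, condition in edges:
        adjacency.setdefault(left, []).append((right, condition))
        adjacency.setdefault(right, []).append((left, condition))
    return adjacency


def find_join_path(adjacency, source, target):
    """Return the shortest chain of joins from source to target, or None.

    Breadth-first search is an assumed stand-in for the paper's graph
    search, whose exact procedure is not given in this summary.
    """
    if source == target:
        return []
    parent = {source: None}
    queue = deque([source])
    while queue:
        table = queue.popleft()
        for neighbour, condition in adjacency.get(table, []):
            if neighbour in parent:
                continue
            parent[neighbour] = (table, condition)
            if neighbour == target:
                # Walk back through the parents to rebuild the chain.
                path = []
                node = target
                while parent[node] is not None:
                    previous, join_condition = parent[node]
                    path.append((previous, node, join_condition))
                    node = previous
                path.reverse()
                return path
            queue.append(neighbour)
    return None


ADJACENCY = build_adjacency(SCHEMA_EDGES)
CHAIN = find_join_path(ADJACENCY, "customers", "products")
for left, right, condition in CHAIN:
    print(f"{left} -> {right} ON {condition}")
```

Because every edge comes from the curated list, each step in the printed chain can be traced back to a join that a person approved, which is what makes the chain interpretable.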
Problem

Research questions and friction points this paper is trying to address.

Multi-table QA is hampered by unreliable schema linking across tables
Existing semantic-similarity methods fail on complex real-world tabular data
A human-guided graph framework is proposed to make schema linking reliable
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph-based framework encodes human-curated schema links
Searches graph to construct interpretable reasoning chains
Pruning and sub-path merging enhance efficiency
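
The pruning and sub-path merging mentioned above can be illustrated with a small Python sketch. The summary does not specify how the paper performs either step, so the rules used here are assumptions: merging keeps each join once when several paths share it, and pruning repeatedly drops leaf tables that the query does not need. All table names are hypothetical.

```python
def merge_paths(paths):
    """Union the joins of several paths, keeping each shared join once."""
    merged = []
    seen = set()
    for path in paths:
        for left, right in path:
            key = frozenset((left, right))  # a join is undirected here
            if key not in seen:
                seen.add(key)
                merged.append((left, right))
    return merged


def prune_leaves(edges, required):
    """Repeatedly remove leaf tables that are not in `required`."""
    edges = list(edges)
    while True:
        degree = {}
        for left, right in edges:
            degree[left] = degree.get(left, 0) + 1
            degree[right] = degree.get(right, 0) + 1
        removable = {
            table
            for table, count in degree.items()
            if count == 1 and table not in required
        }
        if not removable:
            return edges
        edges = [
            (left, right)
            for left, right in edges
            if left not in removable and right not in removable
        ]


# Hypothetical candidate paths, for example one per pair of tables that
# a query mentions. The first two share the customers-orders join.
CANDIDATE_PATHS = [
    [("customers", "orders"), ("orders", "order_items")],
    [("customers", "orders"), ("orders", "shipments")],
    [("orders", "order_items"), ("order_items", "products")],
]

MERGED = merge_paths(CANDIDATE_PATHS)
PRUNED = prune_leaves(MERGED, required={"customers", "order_items"})
print(MERGED)
print(PRUNED)
```

In this toy case the merged plan contains four joins, and pruning reduces it to the two joins that connect the required tables, so fewer tables need to be joined to answer the query.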