Exploring Multi-Table Retrieval Through Iterative Search

📅 2025-11-17

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses cross-table retrieval and information composition in open-domain question answering. We propose an iterative multi-table retrieval framework that jointly optimizes semantic relevance, query coverage, and structural connectability. The approach formulates multi-table retrieval as a greedy iterative search and instantiates it with a lightweight join-aware algorithm that evaluates semantic matching, coverage completeness, and inter-table joinability at each step. Evaluated on five NL2SQL benchmarks, the method achieves retrieval performance competitive with an exact MIP-based approach while running 4–400× faster, depending on the benchmark and search-space settings. The core contributions are: (i) an iterative retrieval paradigm that jointly accounts for semantic relevance, coverage, and joinability; and (ii) an efficient, interpretable, and scalable algorithm for join-aware multi-table retrieval.

📝 Abstract

Open-domain question answering over data lakes requires retrieving and composing information from multiple tables, a challenging subtask that demands both semantic relevance and structural coherence (e.g., joinability). While exact optimization methods such as Mixed-Integer Programming (MIP) can ensure coherence, their computational cost is often prohibitive. Conversely, simpler greedy heuristics that optimize for query coverage alone often fail to find coherent, joinable sets of tables. This paper frames multi-table retrieval as an iterative search process, arguing that this approach offers advantages in scalability, interpretability, and flexibility. We propose a general framework and a concrete instantiation: a fast, effective Greedy Join-Aware Retrieval algorithm that holistically balances relevance, coverage, and joinability. Experiments across five NL2SQL benchmarks demonstrate that our iterative method achieves retrieval performance competitive with the MIP-based approach while being 4–400× faster, depending on the benchmark and search-space settings. This work highlights the potential of iterative heuristics for practical, scalable, and composition-aware retrieval.

Problem

Research questions and friction points this paper is trying to address.

Retrieving semantically relevant and structurally coherent multi-table data

Balancing computational complexity with joinability optimization in table retrieval

Developing scalable iterative methods for multi-table question answering systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Iterative search process for multi-table retrieval

Greedy Join-Aware Retrieval algorithm that balances relevance, coverage, and joinability

Retrieval performance competitive with the MIP-based approach while running 4–400× faster
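
To make the idea of greedy, join-aware selection concrete, here is a minimal Python sketch. It is not the paper's algorithm: the `Table` fields, the shared-column test in `joinable`, the additive score, and the weights `w_rel`, `w_cov`, and `w_join` are all hypothetical choices made for illustration. It only shows the general shape of an iterative search that scores each candidate table on relevance, newly covered query terms, and joinability with the tables already chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    name: str
    columns: frozenset  # column names, used here as a crude joinability signal
    relevance: float    # assumed precomputed query-table similarity in [0, 1]
    covers: frozenset   # query terms this table can account for


def joinable(a, b):
    # Assumption: two tables are joinable if they share a column name.
    return bool(a.columns & b.columns)


def greedy_join_aware(tables, query_terms, k, w_rel=1.0, w_cov=1.0, w_join=1.0):
    """Pick up to k tables, one per iteration, by a combined score."""
    selected = []
    covered = set()
    remaining = list(tables)
    while remaining and len(selected) < k:

        def score(t):
            # Coverage gain counts only query terms not yet covered.
            new_terms = (t.covers & query_terms) - covered
            coverage_gain = len(new_terms) / max(len(query_terms), 1)
            # The first pick has nothing to join with, so this term is 0.
            join = 1.0 if any(joinable(t, s) for s in selected) else 0.0
            return w_rel * t.relevance + w_cov * coverage_gain + w_join * join

        best = max(remaining, key=score)
        selected.append(best)
        covered |= best.covers & query_terms
        remaining.remove(best)
    return selected


# Toy data (hypothetical): two candidate tables both cover the term "city",
# but only one of them shares a column with the "orders" table.
tables = [
    Table("orders", frozenset({"order_id", "customer_id", "total"}), 0.9,
          frozenset({"orders", "total"})),
    Table("customers", frozenset({"customer_id", "name", "city"}), 0.5,
          frozenset({"city"})),
    Table("cities_wiki", frozenset({"city_name", "population"}), 0.6,
          frozenset({"city"})),
]
query = {"orders", "total", "city"}

print([t.name for t in greedy_join_aware(tables, query, k=2)])
# ['orders', 'customers']
```

With the join term switched off (`w_join=0.0`), the same toy data selects `cities_wiki` instead of `customers`: it is slightly more relevant but cannot be joined to `orders`, which is the kind of failure attributed above to coverage-only heuristics.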

🔎 Similar Papers

Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval