🤖 AI Summary
This work addresses the routing of natural language queries in multi-database enterprise environments, where overlapping domains and semantically ambiguous queries make it hard to identify the correct target database. The study introduces a structured reasoning mechanism into the query routing task for the first time, in the form of a modular, reasoning-driven reranking strategy. The approach explicitly models schema coverage, structural connectivity, and fine-grained semantic alignment, combining the strengths of retrieval-based methods and logical inference. Evaluated on realistic benchmarks built by extending existing NL-to-SQL datasets, the method consistently outperforms embedding-only and direct LLM-prompting baselines across all reported metrics, indicating gains in both routing accuracy and robustness.
📝 Abstract
We address the task of routing natural language queries in multi-database enterprise environments. We construct realistic benchmarks by extending existing NL-to-SQL datasets. Our study shows that routing becomes increasingly challenging as database repositories grow larger and overlap in domain, and as queries become more ambiguous, motivating the need for more structured and robust reasoning-based solutions. By explicitly modelling schema coverage, structural connectivity, and fine-grained semantic alignment, the proposed modular, reasoning-driven reranking strategy consistently outperforms embedding-only and direct LLM-prompting baselines across all metrics.
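To make the three signals named in the abstract concrete, the following is a minimal Python sketch of a reranker that scores candidate databases on schema coverage, structural connectivity, and semantic alignment. It is an illustration under stated assumptions, not the authors' implementation: the `Database` class, the token-overlap scoring functions, the weights, and the toy schemas are all hypothetical, and a real system would likely replace the lexical overlap with an embedding model or LLM-based reasoning.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Database:
    """Hypothetical schema record; not taken from the paper."""
    name: str
    tables: dict                                       # table name -> list of column names
    foreign_keys: list = field(default_factory=list)   # (table_a, table_b) pairs


def terms(text):
    """Lowercase alphabetic tokens with a crude plural strip (illustrative only)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w[:-1] if len(w) > 3 and w.endswith("s") else w for w in words}


def table_vocab(table, columns):
    return terms(table + " " + " ".join(columns))


def db_vocab(db):
    vocab = set()
    for table, columns in db.tables.items():
        vocab |= table_vocab(table, columns)
    return vocab


def coverage(q, db):
    """Schema coverage: share of query terms found anywhere in the schema."""
    return len(q & db_vocab(db)) / len(q) if q else 0.0


def connectivity(q, db):
    """Structural connectivity: share of matched tables that sit in the
    largest foreign-key-connected group of matched tables."""
    hit = {t for t, cols in db.tables.items() if q & table_vocab(t, cols)}
    if not hit:
        return 0.0
    neighbours = {t: set() for t in hit}
    for a, b in db.foreign_keys:
        if a in hit and b in hit:
            neighbours[a].add(b)
            neighbours[b].add(a)
    best, seen = 0, set()
    for start in hit:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:                       # depth-first walk of one component
            node = stack.pop()
            size += 1
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        best = max(best, size)
    return best / len(hit)


def alignment(q, db):
    """Semantic alignment stand-in: Jaccard overlap of query and schema terms."""
    union = q | db_vocab(db)
    return len(q & db_vocab(db)) / len(union) if union else 0.0


def rerank(query, candidates, weights=(0.4, 0.3, 0.3)):
    """Order candidate databases by a weighted sum of the three signals.
    The weights are arbitrary placeholders."""
    q = terms(query)
    scored = []
    for db in candidates:
        parts = (coverage(q, db), connectivity(q, db), alignment(q, db))
        scored.append((sum(w * p for w, p in zip(weights, parts)), db.name))
    return [name for _, name in sorted(scored, key=lambda s: -s[0])]


# Toy repository with overlapping domains ("customers" appears twice).
candidates = [
    Database("hr", {"employees": ["id", "name", "salary"],
                    "departments": ["id", "name"]},
             [("employees", "departments")]),
    Database("crm", {"customers": ["id", "name", "email"],
                     "tickets": ["id", "customer_id", "status"]},
             [("tickets", "customers")]),
    Database("sales", {"orders": ["id", "customer_id", "amount"],
                       "customers": ["id", "name"]},
             [("orders", "customers")]),
]

ranking = rerank("total order amount per customer", candidates)
print(ranking)
```

In this toy setup both `sales` and `crm` contain a `customers` table, so a single matching term cannot separate them. The coverage and alignment terms favour `sales` because it also matches "order" and "amount", which is the kind of disambiguation between domain-overlapping databases that the abstract motivates.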