🤖 AI Summary
Traditional RAG systems retrieve unstructured chunks and generate in a single pass, which often leads to redundant context, low information density, and fragile multi-hop reasoning. This work proposes TaSR-RAG, a taxonomy-guided structured reasoning framework that represents both queries and documents as relational triples and therefore needs no explicit graph construction. It constrains entity semantics with a lightweight two-level taxonomy, decomposes complex questions into an ordered sequence of triple sub-queries, and selects evidence step by step using both semantic similarity and structural consistency. An explicit entity binding table resolves intermediate variables across steps and reduces entity conflation. The approach outperforms strong RAG and structured-RAG baselines by up to 14% on multiple multi-hop QA benchmarks while producing clearer evidence attribution and more faithful reasoning traces.
📝 Abstract
Retrieval-Augmented Generation (RAG) helps large language models (LLMs) answer knowledge-intensive and time-sensitive questions by conditioning generation on external evidence. However, most RAG systems still retrieve unstructured chunks and rely on one-shot generation, which often yields redundant context, low information density, and brittle multi-hop reasoning. While structured RAG pipelines can improve grounding, they typically require costly and error-prone graph construction or impose rigid entity-centric structures that do not align with the query's reasoning chain.
We propose TaSR-RAG, a taxonomy-guided structured reasoning framework for evidence selection. We represent both queries and documents as relational triples, and constrain entity semantics with a lightweight two-level taxonomy to balance generalization and precision. Given a complex question, TaSR-RAG decomposes it into an ordered sequence of triple sub-queries with explicit latent variables, then performs step-wise evidence selection via hybrid triple matching that combines semantic similarity over raw triples with structural consistency over typed triples.
By maintaining an explicit entity binding table across steps, TaSR-RAG resolves intermediate variables and reduces entity conflation without explicit graph construction or exhaustive search. Experiments on multiple multi-hop question answering benchmarks show that TaSR-RAG consistently outperforms strong RAG and structured-RAG baselines by up to 14%, while producing clearer evidence attribution and more faithful reasoning traces.
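
To make the step-wise selection described above more concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the `Triple` class, the `?x`-style variable convention, the Jaccard token overlap standing in for a learned semantic similarity, the equal weighting `alpha=0.5`, the single-level type labels standing in for the two-level taxonomy, and the greedy top-1 choice at each step.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Triple:
    """Hypothetical typed triple: raw text plus a taxonomy label per entity."""
    head: str
    relation: str
    tail: str
    head_type: str
    tail_type: str


def is_var(slot: str) -> bool:
    # Assumed convention: latent variables are written as "?x", "?y", ...
    return slot.startswith("?")


def jaccard(a: str, b: str) -> float:
    # Toy stand-in for semantic similarity (a real system would likely use embeddings).
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def hybrid_score(q: Triple, d: Triple, alpha: float = 0.5) -> float:
    # Semantic part: compare raw triple text, skipping unresolved variables.
    q_text = " ".join(p for p in (q.head, q.relation, q.tail) if not is_var(p))
    d_text = " ".join((d.head, d.relation, d.tail))
    semantic = jaccard(q_text, d_text)
    # Structural part: fraction of entity slots whose type labels agree.
    structural = ((q.head_type == d.head_type) + (q.tail_type == d.tail_type)) / 2
    return alpha * semantic + (1 - alpha) * structural


def select_evidence(sub_queries, corpus):
    bindings = {}   # entity binding table: variable -> resolved entity
    evidence = []
    for q in sub_queries:
        # Substitute variables that earlier steps have already resolved.
        q = replace(q, head=bindings.get(q.head, q.head),
                    tail=bindings.get(q.tail, q.tail))
        best = max(corpus, key=lambda d: hybrid_score(q, d))
        evidence.append(best)
        # Bind any variables that are still open to the matched entities.
        if is_var(q.head):
            bindings[q.head] = best.head
        if is_var(q.tail):
            bindings[q.tail] = best.tail
    return evidence, bindings


# "Where was the director of Inception born?" as two ordered sub-queries.
sub_queries = [
    Triple("?x", "directed", "Inception", "Person", "Film"),
    Triple("?x", "born in", "?y", "Person", "Location"),
]
corpus = [
    Triple("Christopher Nolan", "directed", "Inception", "Person", "Film"),
    Triple("Christopher Nolan", "born in", "London", "Person", "Location"),
    Triple("Leonardo DiCaprio", "starred in", "Inception", "Person", "Film"),
    Triple("Leonardo DiCaprio", "born in", "Los Angeles", "Person", "Location"),
]
evidence, bindings = select_evidence(sub_queries, corpus)
print(bindings)  # {'?x': 'Christopher Nolan', '?y': 'London'}
```

In this toy run the binding of `?x` from the first step is what steers the second step toward the Nolan triple rather than the structurally identical DiCaprio triple, which is the kind of entity conflation the binding table is meant to reduce.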