TreeRare: Syntax Tree-Guided Retrieval and Reasoning for Knowledge-Intensive Question Answering

📅 2025-05-31
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) struggle with complex, knowledge-intensive question answering (QA) because reasoning errors accumulate and retrieved passages are often misaligned with the question. To address this, we propose TreeRare, a syntax-tree-guided retrieval and reasoning framework. Our method traverses the question's syntax tree bottom-up; at each node it generates a query for that constituent, retrieves relevant passages to resolve localized uncertainty, and condenses them into concise, context-aware evidence, which is then aggregated across the tree into a final answer. Unlike iterative and adaptive retrieval, where the LLM decides when and what to retrieve, our approach ties question decomposition and evidence acquisition to the question's syntactic structure, with the aim of limiting error propagation and retrieval misalignment. Evaluated on five QA datasets involving ambiguous or multi-hop questions, it achieves substantial improvements over existing state-of-the-art methods. The results suggest that syntactic structure is a useful guide for improving accuracy in knowledge-intensive QA.

📝 Abstract
In practice, questions are typically complex and knowledge-intensive, requiring Large Language Models (LLMs) to recognize the multifaceted nature of the question and reason across multiple information sources. Iterative and adaptive retrieval, where LLMs decide when and what to retrieve based on their reasoning, has been shown to be a promising approach to resolve complex, knowledge-intensive questions. However, the performance of such retrieval frameworks is limited by the accumulation of reasoning errors and misaligned retrieval results. To overcome these limitations, we propose TreeRare (Syntax Tree-Guided Retrieval and Reasoning), a framework that utilizes syntax trees to guide information retrieval and reasoning for question answering. Following the principle of compositionality, TreeRare traverses the syntax tree in a bottom-up fashion, and at each node, it generates subcomponent-based queries and retrieves relevant passages to resolve localized uncertainty. A subcomponent question answering module then synthesizes these passages into concise, context-aware evidence. Finally, TreeRare aggregates the evidence across the tree to form a final answer. Experiments across five question answering datasets involving ambiguous or multi-hop reasoning demonstrate that TreeRare achieves substantial improvements over existing state-of-the-art methods.
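The bottom-up traversal described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Node` class, the hand-built tree, the word-overlap `retrieve` function, and the three-sentence corpus are all assumptions made for illustration. In the actual framework a syntactic parser would produce the tree, and an LLM would write the queries and synthesize the evidence.

```python
import re
from dataclasses import dataclass, field
from typing import List

STOPWORDS = {"the", "of", "was", "is", "in", "by", "where", "a"}


@dataclass
class Node:
    """One constituent of the question's syntax tree (hypothetical structure)."""
    text: str
    children: List["Node"] = field(default_factory=list)
    evidence: List[str] = field(default_factory=list)


def tokens(text: str) -> set:
    """Lowercased content words of a string."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank passages by content-word overlap."""
    q = tokens(query)
    scored = [(len(q & tokens(p)), p) for p in corpus]
    ranked = sorted(scored, key=lambda sp: sp[0], reverse=True)
    return [p for score, p in ranked if score > 0][:k]


def resolve(node: Node, corpus: List[str], visited: List[str]) -> List[str]:
    """Post-order traversal: children are resolved before their parent."""
    child_passages = [p for c in node.children for p in resolve(c, corpus, visited)]
    # Stand-in for LLM query generation: constituent text plus child evidence.
    query = " ".join([node.text, *child_passages])
    # Stand-in for the subcomponent QA module: keep the retrieved passages as-is.
    node.evidence = retrieve(query, corpus)
    visited.append(node.text)
    return node.evidence


def aggregate(root: Node) -> str:
    """Collect distinct evidence passages over the tree, leaves first."""
    seen: List[str] = []

    def walk(n: Node) -> None:
        for c in n.children:
            walk(c)
        for p in n.evidence:
            if p not in seen:
                seen.append(p)

    walk(root)
    return " ".join(seen)


corpus = [
    "Inception was directed by Christopher Nolan.",
    "Christopher Nolan was born in London.",
    "Paris is the capital of France.",
]
question = Node(
    "where was the director of Inception born",
    [Node("the director of Inception", [Node("Inception")])],
)
visited: List[str] = []
resolve(question, corpus, visited)
final_context = aggregate(question)
```

In this toy run the innermost constituent is resolved first, and its evidence is folded into the query for the enclosing constituent, which is how the second-hop passage becomes reachable. `final_context` is the aggregated evidence that a real system would pass to an LLM to produce the final answer.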
Problem

Research questions and friction points this paper is trying to address.

Resolving complex knowledge-intensive questions with syntax trees
Overcoming reasoning errors in iterative retrieval frameworks
Improving multi-hop reasoning accuracy in question answering
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses syntax trees to guide retrieval and reasoning
Generates subcomponent queries for localized uncertainty
Aggregates evidence across tree for final answer
Boyi Zhang
University of Rochester
Zhuo Liu
University of Rochester
Hangfeng He
University of Rochester
Natural Language Processing, Machine Learning