🤖 AI Summary
Existing RAG methods treat documents as flat, unstructured text chunks and neglect their inherent structural information, which limits multi-document reasoning and knowledge integration. To address this, we propose RDR2, the first end-to-end document-structure-aware retrieval-augmented generation framework. RDR2 employs a trainable, LLM-driven router that dynamically navigates an explicitly modeled document structure tree. It integrates automated action construction, structure-aware paragraph selection, and hierarchical evidence assembly to jointly optimize content relevance and hierarchical coherence, and it refines routing decisions via reinforcement-learning-inspired policy optimization. Evaluated on five challenging multi-document reasoning benchmarks, RDR2 achieves significant improvements in answer accuracy and information synthesis quality. Our results demonstrate that explicit structural modeling delivers a critical performance gain for RAG systems, advancing robust, interpretable, and contextually grounded reasoning over structured document collections.
📝 Abstract
While large language models (LLMs) demonstrate impressive capabilities, their reliance on parametric knowledge often leads to factual inaccuracies. Retrieval-Augmented Generation (RAG) mitigates this by leveraging external documents, yet existing approaches treat retrieved passages as isolated chunks, ignoring valuable structure that is crucial for document organization. Motivated by this gap, we propose Retrieve-DocumentRoute-Read (RDR2), a novel framework that explicitly incorporates structural information throughout the RAG process. RDR2 employs an LLM-based router to dynamically navigate document structure trees, jointly evaluating content relevance and hierarchical relationships to assemble optimal evidence. Our key innovation lies in formulating document routing as a trainable task, with automatic action curation and structure-aware passage selection inspired by human reading strategies. Through comprehensive evaluation on five challenging datasets, RDR2 achieves state-of-the-art performance, demonstrating that explicit structural awareness significantly enhances RAG systems' ability to acquire and utilize knowledge, particularly in complex scenarios requiring multi-document synthesis.
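To make the idea of routing over a document structure tree concrete, here is a minimal sketch. It is not RDR2's implementation: the `Node` schema, the `route` function, the `budget` parameter, and the word-overlap scorer are all hypothetical, and the scorer is only a toy stand-in for the trained LLM router described above. The sketch shows a best-first walk that expands headings, picks passages by relevance, and keeps each passage's heading path so the assembled evidence retains its place in the hierarchy.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of a document structure tree (hypothetical schema)."""
    title: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def tokens(s: str) -> set[str]:
    """Lowercased word tokens of a string."""
    return set(re.findall(r"\w+", s.lower()))


def score(query: str, node: Node) -> int:
    """Toy stand-in for an LLM router: count query words found in the node."""
    return len(tokens(query) & tokens(f"{node.title} {node.text}"))


def route(query: str, root: Node, budget: int = 2) -> list[str]:
    """Best-first walk over the tree; returns up to `budget` passages.

    Each returned passage is prefixed with its heading path.
    """
    evidence: list[str] = []
    frontier = [(root, [root.title])]
    while frontier and len(evidence) < budget:
        # Visit the most relevant candidate next (stable sort keeps ties in order).
        frontier.sort(key=lambda item: score(query, item[0]), reverse=True)
        node, path = frontier.pop(0)
        if node.text:  # "select" action: keep this passage as evidence
            evidence.append(" > ".join(path) + ": " + node.text)
        for child in node.children:  # "expand" action: open the subsections
            frontier.append((child, path + [child.title]))
    return evidence


root = Node("Solar System", children=[
    Node("Mars", children=[
        Node("Moons", "Mars has two moons, Phobos and Deimos."),
        Node("Atmosphere", "The atmosphere of Mars is mostly carbon dioxide."),
    ]),
    Node("Venus", children=[
        Node("Atmosphere", "Venus has a dense carbon dioxide atmosphere."),
    ]),
])
query = "How many moons does Mars have?"

for passage in route(query, root, budget=1):
    print(passage)
```

In a real system, the scoring step would be replaced by a model call that sees both the content and the position of each candidate in the tree; the overlap count here only keeps the example self-contained.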