Structured Knowledge Representation through Contextual Pages for Retrieval-Augmented Generation

📅 2026-01-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of existing retrieval-augmented generation (RAG) methods, which often lack structured organization when iteratively accumulating knowledge, thereby compromising the coherence and completeness of knowledge representation. To overcome this, the authors propose PAGER, a novel framework that introduces a page-driven, structured knowledge representation mechanism. PAGER guides large language models to construct multi-slot cognitive outlines, iteratively retrieves relevant information, and fills each knowledge slot accordingly, ultimately producing a structured context page to inform answer generation. By integrating prompt engineering, iterative retrieval, and slot-filling techniques, PAGER enables autonomous and orderly knowledge integration. Experimental results demonstrate that PAGER significantly outperforms current RAG approaches across multiple knowledge-intensive benchmarks and backbone models, yielding responses with higher information density, stronger consistency, reduced knowledge conflicts, and improved efficiency in leveraging external knowledge.

📝 Abstract
Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by incorporating external knowledge. Recently, some works have incorporated iterative knowledge accumulation processes into RAG models to progressively accumulate and refine query-related knowledge, thereby constructing more comprehensive knowledge representations. However, these iterative processes often lack a coherent organizational structure, which limits the construction of more comprehensive and cohesive knowledge representations. To address this, we propose PAGER, a page-driven autonomous knowledge representation framework for RAG. PAGER first prompts an LLM to construct a structured cognitive outline for a given question, consisting of multiple slots that each represent a distinct knowledge aspect. Then, PAGER iteratively retrieves and refines relevant documents to populate each slot, ultimately constructing a coherent page that serves as contextual input for guiding answer generation. Experiments on multiple knowledge-intensive benchmarks and backbone models show that PAGER consistently outperforms all RAG baselines. Further analyses demonstrate that PAGER constructs higher-quality and information-dense knowledge representations, better mitigates knowledge conflicts, and enables LLMs to leverage external knowledge more effectively. All code is available at https://github.com/OpenBMB/PAGER.
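The outline-then-fill pipeline described in the abstract can be sketched as follows. This is a minimal illustrative stand-in, not the authors' implementation: `build_outline` and `retrieve` are toy stubs (a real system would prompt an LLM for the outline and query a document index), and all function names and slot labels here are assumptions for illustration.

```python
# Minimal sketch of a PAGER-style outline/slot-filling pipeline.
# All names and behaviors below are illustrative assumptions, not
# the implementation released at https://github.com/OpenBMB/PAGER.

def build_outline(question: str) -> list[str]:
    """Stage 1 (stub): an LLM would draft a cognitive outline here --
    a list of slots, each covering one distinct knowledge aspect."""
    return ["background", "core_facts", "supporting_evidence"]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Stage 2 (stub): rank a tiny in-memory corpus by naive term
    overlap; a real system would query an external retriever."""
    corpus = [
        "Paris is the capital of France.",
        "France is a country in Western Europe.",
        "The Eiffel Tower is a landmark in Paris.",
    ]
    terms = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:k]

def fill_slot(slot: str, question: str) -> str:
    """Populate one slot with the text of its retrieved documents."""
    return " ".join(retrieve(f"{question} {slot}"))

def build_page(question: str) -> dict[str, str]:
    """Iterate over the outline, filling each slot to form the
    structured context page that conditions answer generation."""
    return {slot: fill_slot(slot, question) for slot in build_outline(question)}

if __name__ == "__main__":
    page = build_page("What is the capital of France?")
    for slot, text in page.items():
        print(f"[{slot}] {text}")
```

In the actual framework, the retrieval-and-fill step is iterative and refines slot contents over multiple rounds; the single-pass dictionary comprehension above is a simplification to show the overall data flow from question to page.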
Problem

Research questions and friction points this paper is trying to address.

Retrieval-Augmented Generation
structured knowledge representation
knowledge organization
iterative knowledge accumulation
coherent knowledge representation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Retrieval-Augmented Generation
Structured Knowledge Representation
Contextual Pages
Iterative Knowledge Refinement
Cognitive Outline