RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation

📅 2025-02-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the inefficient retrieval, unstructured knowledge integration, and single-pass reasoning limitations of retrieval-augmented language models (RALMs) on knowledge-intensive tasks, this paper proposes a framework built on iterative retrieval and structured reasoning over a dynamically constructed knowledge graph. Methodologically, it introduces: (1) theme-scoped retrieval to narrow the search space while preserving relevance; (2) action-planning–guided sub-query generation for fine-grained reasoning control; (3) dynamic text-to-knowledge-graph conversion that builds query-specific structured knowledge representations; and (4) graph-augmented answer generation. Evaluated on seven knowledge-intensive generation benchmarks, the framework achieves state-of-the-art performance, surpassing leading baselines by 6.4% with open-source language models and 7.0% with proprietary models. Ablation studies validate the contribution of each component.

📝 Abstract
Retrieval-augmented language models often struggle with knowledge-intensive tasks due to inefficient retrieval, unstructured knowledge integration, and single-pass architectures. We present Retrieval-And-Structuring (RAS), a novel framework that dynamically constructs and reasons over query-specific knowledge graphs through iterative retrieval and structuring. RAS introduces four key technical innovations: (1) a theme-scoped retrieval mechanism that efficiently narrows the search space while maintaining retrieval quality, (2) an action planning module that determines knowledge needs and generates focused sub-queries, (3) a dynamic knowledge structuring approach that converts retrieved text into an evolving knowledge graph, and (4) a graph-augmented answering component that leverages the accumulated structured information. Our framework achieves state-of-the-art performance, surpassing leading baselines by 6.4% with open-source language models and 7.0% with proprietary models on seven knowledge-intensive generation datasets across all evaluation metrics. Detailed ablation studies verify the contribution of each technical component to the overall system performance.
Problem

Research questions and friction points this paper is trying to address.

Language models struggle on knowledge-intensive tasks
Retrieval is inefficient and retrieved knowledge is integrated without structure
Single-pass architectures limit multi-step knowledge graph construction and reasoning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Theme-scoped retrieval mechanism narrows the search space while preserving quality
Action planning module determines knowledge needs and generates focused sub-queries
Dynamic knowledge structuring converts retrieved text into an evolving knowledge graph
Graph-augmented answering leverages the accumulated structured knowledge
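The iterative retrieve-and-structure loop behind these components can be sketched as follows. This is a toy illustration only: the helper functions (`retrieve_in_theme`, `extract_triples`, `plan_action`, `answer_from_graph`) are hypothetical stand-ins for the paper's retrieval, planning, structuring, and GNN-based answering modules, implemented here as minimal stubs so the control flow is runnable.

```python
def retrieve_in_theme(query, theme):
    # (1) Theme-scoped retrieval: search only passages within the theme,
    # narrowing the search space. Here a tiny hard-coded corpus stands in.
    corpus = {
        "geography": ["Paris is the capital of France."],
        "biology": ["Mitochondria produce ATP in eukaryotic cells."],
    }
    return corpus.get(theme, [])

def extract_triples(passages):
    # (3) Dynamic structuring: convert retrieved text into
    # (head, relation, tail) triples for the evolving knowledge graph.
    triples = []
    for p in passages:
        if " is the capital of " in p:
            head, tail = p.split(" is the capital of ")
            triples.append((head.strip(), "capital_of", tail.strip(". ")))
    return triples

def plan_action(graph, query):
    # (2) Action planning: decide whether the graph already suffices
    # or a further focused sub-query is needed.
    return "answer" if graph else "retrieve_more"

def answer_from_graph(graph, query):
    # (4) Graph-augmented answering: in the paper this is GNN-enhanced
    # generation; here we simply read the answer off the triples.
    for head, rel, tail in graph:
        if rel == "capital_of" and tail in query:
            return head
    return None

def ras(query, theme, max_steps=3):
    # Iterate: retrieve within the theme, grow the graph, then either
    # answer or loop for more knowledge.
    graph = []
    for _ in range(max_steps):
        graph.extend(extract_triples(retrieve_in_theme(query, theme)))
        if plan_action(graph, query) == "answer":
            break
    return answer_from_graph(graph, query)

print(ras("What is the capital of France?", "geography"))  # → Paris
```

In the actual system each stub would be a learned component; the sketch only shows how retrieval, planning, structuring, and answering interact in one loop.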