Improving Context Fidelity via Native Retrieval-Augmented Reasoning

📅 2025-09-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) suffer from low context fidelity and answer inconsistency when answering questions grounded in a given context. To address this, we propose CARE, a native retrieval-augmented reasoning framework that enables LLMs to autonomously and dynamically retrieve and integrate critical in-context tokens directly within their reasoning chains, without external retrievers or large-scale annotated data. CARE combines context-aware prompting, lightweight evidence supervision, and a strategic internal token-retrieval mechanism. Evaluated on multiple real-world and counterfactual question-answering benchmarks, CARE significantly outperforms supervised fine-tuning, conventional retrieval-augmented generation (RAG), and external retrieval methods, improving retrieval accuracy while enhancing answer consistency and reliability. CARE establishes an efficient, scalable paradigm for context-faithful reasoning in LLMs.

📝 Abstract
Large language models (LLMs) often struggle with context fidelity, producing inconsistent answers when responding to questions based on provided information. Existing approaches either rely on expensive supervised fine-tuning to generate evidence post-answer, or train models to perform web searches without necessarily improving utilization of the given context. We propose CARE, a novel native retrieval-augmented reasoning framework that teaches LLMs to explicitly integrate in-context evidence within their reasoning process using the model's own retrieval capabilities. Our method requires limited labeled evidence data while significantly enhancing both retrieval accuracy and answer generation performance through strategically retrieved in-context tokens in the reasoning chain. Extensive experiments on multiple real-world and counterfactual QA benchmarks demonstrate that our approach substantially outperforms supervised fine-tuning, traditional retrieval-augmented generation methods, and external retrieval solutions. This work represents a fundamental advancement in making LLMs more accurate, reliable, and efficient for knowledge-intensive tasks.
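The core idea, quoting evidence spans from the provided context directly inside the reasoning chain rather than calling an external retriever, can be sketched as below. This is a minimal illustration, not the paper's method: sentence selection by lexical overlap stands in for the model's learned retrieval behavior, and the `<retrieval>` tag name is an assumed marker format, not necessarily what the paper uses.

```python
import re

def build_reasoning_trace(context: str, question: str, top_k: int = 2) -> str:
    """Sketch of native retrieval-augmented reasoning: the reasoning trace
    quotes evidence spans verbatim from the given context, so the answer
    stays grounded in that context. Lexical-overlap scoring here is an
    illustrative stand-in for the model's own retrieval capability."""
    sentences = re.split(r"(?<=[.!?])\s+", context.strip())
    q_tokens = set(re.findall(r"\w+", question.lower()))
    # Rank context sentences by how many question tokens they share.
    scored = sorted(
        sentences,
        key=lambda s: len(q_tokens & set(re.findall(r"\w+", s.lower()))),
        reverse=True,
    )
    # Wrap the top-scoring spans in a marker tag (assumed format) so the
    # retrieved evidence appears explicitly inside the reasoning chain.
    quoted = "\n".join(f"<retrieval>{s}</retrieval>" for s in scored[:top_k])
    return (
        f"Question: {question}\n"
        f"Evidence from context:\n{quoted}\n"
        f"Answer based on the quoted evidence."
    )
```

A call such as `build_reasoning_trace(ctx, "What is the capital of France?", top_k=1)` would surface the most question-relevant sentence from `ctx` inside the trace, mimicking how in-context tokens are strategically retrieved before the answer is produced.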
Problem

Research questions and friction points this paper addresses.

Improving context fidelity in LLMs
Enhancing retrieval-augmented reasoning capabilities
Reducing inconsistency in evidence-based answers
Innovation

Methods, ideas, or system contributions that make the work stand out.

Native retrieval-augmented reasoning framework
Explicit in-context evidence integration
Strategic token retrieval for reasoning