🤖 AI Summary
This work addresses the challenge of enhancing memory capabilities in large language models (LLMs). We propose RAG-Tuned-LLM: the first approach to reformulate the retrieval-augmented generation (RAG) paradigm as a supervised fine-tuning objective. Leveraging RAG-inspired data synthesis, we perform end-to-end fine-tuning on medium- and small-scale LLMs (e.g., 7B parameters), endowing them with an intrinsic, efficient, and generalizable “LLM-native memory” capability. Our method seamlessly integrates long-context modeling with precise retrieval, jointly supporting global contextual understanding and targeted key-information retrieval. Evaluated on three newly constructed multi-scale memory benchmark datasets, RAG-Tuned-LLM significantly outperforms both long-context LLMs and conventional RAG baselines—particularly excelling on global-reasoning and keyword-based queries, where accuracy improvements are substantial.
📝 Abstract
Memory, i.e., additional information beyond what large language models (LLMs) saw during training, is crucial to various real-world applications, such as personal assistants. The two mainstream solutions for incorporating memory into the generation process are long-context LLMs and retrieval-augmented generation (RAG). In this paper, we first systematically compare these two types of solutions on three renovated/new datasets and show that (1) long-context solutions, although more expensive, find it easier to capture the big picture and better answer queries that require considering the memory as a whole; and (2) when queries concern specific information, RAG solutions are more competitive, especially when keywords can be explicitly matched. We therefore propose a novel method, RAG-Tuned-LLM, which fine-tunes a relatively small (e.g., 7B) LLM using data generated following RAG principles, so that it combines the advantages of both solutions. Extensive experiments on three datasets demonstrate that RAG-Tuned-LLM outperforms both long-context LLMs and RAG methods across a wide range of query types.
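As a rough illustration of the data-synthesis idea the abstract describes, the sketch below chunks a memory corpus into retrieval-sized units and derives a grounded query/answer pair per chunk, yielding (instruction, response) examples for supervised fine-tuning. This is a minimal sketch under our own assumptions, not the authors' implementation: the function names (`chunk_corpus`, `synthesize_sft_examples`), the fixed-size chunking, and the `generate` teacher callable are all hypothetical.

```python
# Hypothetical sketch of RAG-inspired data synthesis for SFT:
# chunk the memory corpus as a RAG system would, then have a teacher
# model produce a question and a chunk-grounded answer per chunk.
# All names here are illustrative, not from the paper.

def chunk_corpus(text: str, chunk_size: int = 200) -> list[str]:
    """Split the memory corpus into fixed-size word chunks
    (a stand-in for the retrieval units a RAG index would hold)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def synthesize_sft_examples(corpus: str, generate) -> list[dict]:
    """For each chunk, ask a teacher model (the `generate` callable)
    for a query and an answer grounded only in that chunk, producing
    (instruction, response) pairs for end-to-end fine-tuning."""
    examples = []
    for chunk in chunk_corpus(corpus):
        query = generate(f"Write a question answerable from:\n{chunk}")
        answer = generate(f"Answer '{query}' using only:\n{chunk}")
        examples.append({"instruction": query, "response": answer})
    return examples

# Trivial stand-in teacher so the sketch runs end-to-end; in practice
# this would be a call to a strong LLM.
def dummy_teacher(prompt: str) -> str:
    return prompt.splitlines()[0]

if __name__ == "__main__":
    corpus = "word " * 450  # 450 words -> chunks of 200, 200, 50
    data = synthesize_sft_examples(corpus, dummy_teacher)
    print(len(data))  # -> 3, one SFT example per chunk
```

The resulting examples would then be fed to a standard instruction-tuning pipeline for the small LLM, which is what lets the fine-tuned model internalize both chunk-level retrieval behavior and corpus-level context.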