🤖 AI Summary
This work addresses a critical limitation of current retrieval-augmented generation (RAG) systems: query-time adaptations are not persisted, causing redundant computation and precluding cumulative learning, while index-side updates often induce semantic drift and noise accumulation. To overcome this, the authors propose Evolving Retrieval Memory (ERM), a training-free continual-learning framework that converts transient query-time gains into persistent key evolution through correctness-gated feedback and atomic-level signal attribution. ERM establishes, for the first time, a theoretical equivalence between query expansion and key expansion under standard similarity functions, introduces a norm-bounded stable update mechanism, and proves convergence under selective updating, enabling online index optimization with zero inference overhead. Evaluated across 13 domains from the BEIR and BRIGHT benchmarks, ERM significantly improves both retrieval and generation performance, excelling particularly on reasoning-intensive tasks while preserving native retrieval speed.
📝 Abstract
Retrieval-augmented generation (RAG) systems commonly improve robustness via query-time adaptations such as query expansion and iterative retrieval. While effective, these approaches are inherently stateless: adaptations are recomputed for each query and discarded thereafter, precluding cumulative learning and repeatedly incurring inference-time cost. Index-side approaches like key expansion introduce persistence but rely on offline preprocessing or heuristic updates that are weakly aligned with downstream task utility, leading to semantic drift and noise accumulation. We propose Evolving Retrieval Memory (ERM), a training-free framework that transforms transient query-time gains into persistent retrieval improvements. ERM updates the retrieval index through correctness-gated feedback, selectively attributes atomic expansion signals to the document keys they benefit, and progressively evolves keys via stable, norm-bounded updates. We show that query and key expansion are theoretically equivalent under standard similarity functions and prove convergence of ERM's selective updates, amortizing optimal query expansion into a stable index with zero inference-time overhead. Experiments on BEIR and BRIGHT across 13 domains demonstrate consistent gains in retrieval and generation, particularly on reasoning-intensive tasks, at native retrieval speed.
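The claimed equivalence between query expansion and key expansion is easy to see for dot-product similarity: the score gain from adding an expansion vector to the query can always be matched by a suitable update to the document key. The sketch below demonstrates this numerically and adds a correctness-gated, norm-bounded key update in the spirit of ERM; note that `evolve_key`, `eta`, and `max_norm` are illustrative names and parameters, not the paper's actual update rule:

```python
import numpy as np

rng = np.random.default_rng(0)
q = rng.normal(size=8)   # query embedding
k = rng.normal(size=8)   # document key embedding
e = rng.normal(size=8)   # query-expansion vector

# Query expansion changes the dot-product score by <e, k>.
gain = e @ k

# An equivalent key-side update: move k along q so that
# <q, k + dk> - <q, k> produces the same gain.
dk = (gain / (q @ q)) * q
assert np.isclose((q + e) @ k, q @ (k + dk))

def evolve_key(k, dk, answered_correctly, eta=0.1, max_norm=None):
    """Correctness-gated, norm-bounded key update (illustrative only)."""
    if not answered_correctly:        # gate: persist only signals that helped
        return k
    if max_norm is None:
        max_norm = np.linalg.norm(k)  # keep the key's scale stable
    k_new = k + eta * dk
    norm = np.linalg.norm(k_new)
    if norm > max_norm:               # project back onto the norm ball
        k_new *= max_norm / norm
    return k_new
```

Because the update is applied to the stored key offline, retrieval at query time remains a plain similarity search, which is consistent with the abstract's claim of zero inference-time overhead.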