R^2-Mem: Reflective Experience for Memory Search

📅 2026-05-13

🤖 AI Summary
Existing memory-augmented search agents struggle to effectively learn from historical trajectories of varying quality, often repeating past mistakes. This work proposes a reinforcement learning–free, reflective experience framework that distills abstract experiential knowledge during an offline phase through a rubric-guided evaluator and a self-reflection mechanism. During online inference, the agent leverages retrieved experiences alongside a guidance module to refine its search behavior. The approach substantially enhances performance, achieving up to a 22.6% improvement in F1 score while simultaneously reducing token consumption by 12.9% and the number of search iterations by 20.2%. These results demonstrate a cost-effective, self-improving capability for memory-based search without reliance on online learning or reward signals.
📝 Abstract
Deep search has recently emerged as a promising paradigm that lets agents retrieve fine-grained historical information without heavy memory pre-management. However, existing deep search agents for memory systems repeat past errors because they fail to learn from prior high- and low-quality search trajectories. To address this limitation, we propose R^2-Mem, a reflective experience framework for memory search systems. In the offline stage, a Rubric-guided Evaluator scores low- and high-quality steps in historical trajectories, and a self-Reflection Learner distills the corresponding abstract experience. During online inference, the retrieved experience guides future search actions to avoid repeated mistakes and maintain high-quality behaviors. Extensive experiments demonstrate that R^2-Mem consistently improves both effectiveness and efficiency over strong baselines, improving F1 scores by up to 22.6% while reducing token consumption by 12.9% and search iterations by 20.2%. These results verify that R^2-Mem provides an RL-free and low-cost solution for self-improving LLM agents.
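
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `Step`, `Experience`, `rubric_score`, `reflect`, and `retrieve` are hypothetical, the rubric is a hand-written toy rule standing in for whatever evaluator the authors use, the "reflection" is a string template rather than a model-generated lesson, and retrieval is plain keyword overlap. It only shows the shape of the pipeline: score steps offline, distill them into abstract lessons, then look up relevant lessons at inference time.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Step:
    query: str  # search query issued at this step
    hit: bool   # whether the step retrieved useful evidence (assumed label)


@dataclass
class Experience:
    topic: str   # text describing when the lesson applies
    lesson: str  # abstract guidance distilled from one trajectory step


def rubric_score(step: Step, seen_queries: set[str]) -> float:
    """Toy rubric (an assumption): reward useful hits, penalize repeated queries."""
    score = 1.0 if step.hit else 0.0
    if step.query in seen_queries:
        score -= 0.5
    return score


def reflect(question: str, steps: list[Step]) -> list[Experience]:
    """Offline stage: turn clearly good or clearly bad steps into lessons."""
    experiences: list[Experience] = []
    seen: set[str] = set()
    for step in steps:
        score = rubric_score(step, seen)
        seen.add(step.query)
        if score >= 1.0:
            lesson = f"Keep: queries like '{step.query}' found evidence."
        elif score <= 0.0:
            lesson = f"Avoid: queries like '{step.query}' found nothing."
        else:
            continue  # ambiguous steps yield no lesson in this sketch
        experiences.append(Experience(topic=question, lesson=lesson))
    return experiences


def retrieve(bank: list[Experience], question: str, k: int = 2) -> list[Experience]:
    """Online stage: return up to k lessons whose topic shares words with the question."""
    q_tokens = set(question.lower().split())

    def overlap(exp: Experience) -> int:
        return len(q_tokens & set(exp.topic.lower().split()))

    ranked = sorted(bank, key=overlap, reverse=True)
    return [exp for exp in ranked[:k] if overlap(exp) > 0]
```

In use, `reflect` would be run once over stored trajectories to build an experience bank, and the lessons returned by `retrieve` would be added to the agent's prompt before it plans its next search action. No model weights are updated at any point, which is the sense in which such a pipeline is RL-free.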
Problem

Research questions and friction points this paper is trying to address.

memory search
deep search
experience learning
error repetition
LLM agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

reflective experience
memory search
trajectory evaluation
self-improving agents
RL-free learning