MemR$^3$: Memory Retrieval via Reflective Reasoning for LLM Agents

📅 2025-12-23
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing memory systems for LLM agents emphasize compressed storage while neglecting explicit, closed-loop control over the retrieval process. Method: We propose an autonomous memory-retrieval framework for LLM agents that replaces the conventional "retrieve-then-answer" paradigm with a tripartite "retrieve-reflect-answer" action-routing mechanism, and introduces an explicit Evidence Gap Tracker that models and dynamically regulates the global evidence state, yielding interpretable and precise retrieval. The framework is plug-and-play with mainstream memory backends and combines LLM-driven autonomous routing, closed-loop feedback control, and a RAG-enhanced architecture. Contribution/Results: Evaluated on the LoCoMo benchmark with LLM-as-a-Judge scoring, the method achieves significant gains over strong baselines: +7.29% over standard RAG and +1.94% over Zep (with GPT-4.1-mini as the judge backend), demonstrating superior retrieval fidelity and reasoning coherence.

📝 Abstract
Memory systems have been designed to leverage past experiences in Large Language Model (LLM) agents. However, many deployed memory systems primarily optimize compression and storage, placing comparatively little emphasis on explicit, closed-loop control of memory retrieval. Motivated by this observation, we design memory retrieval as an autonomous, accurate, and backend-compatible agent system, named MemR$^3$, which has two core mechanisms: 1) a router that selects among retrieve, reflect, and answer actions to optimize answer quality; and 2) a global evidence-gap tracker that makes the answering process transparent and tracks the evidence-collection process. This design departs from the standard retrieve-then-answer pipeline by introducing a closed-loop control mechanism that enables autonomous decision-making. Empirical results on the LoCoMo benchmark demonstrate that MemR$^3$ surpasses strong baselines on LLM-as-a-Judge score; in particular, it improves existing retrievers across four question categories, with overall gains over RAG (+7.29%) and Zep (+1.94%) using a GPT-4.1-mini backend, offering a plug-and-play controller for existing memory stores.
Problem

Research questions and friction points this paper is trying to address.

Enhances memory retrieval control in LLM agents
Introduces closed-loop reasoning for autonomous decision-making
Improves answer quality via reflective evidence tracking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Closed-loop control for autonomous memory retrieval
Router selects among retrieve, reflect, answer actions
Global evidence-gap tracker renders answering process transparent
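The paper does not provide implementation details here, so the following is only a rough sketch of the described control loop under assumed interfaces: `memory_search` (any pluggable backend, e.g. RAG or Zep) and `llm` are hypothetical stand-ins, and the routing heuristic and prompts are illustrative, not the paper's actual router.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceGapTracker:
    """Tracks which evidence gaps for the current question remain open."""
    gaps: dict = field(default_factory=dict)  # gap description -> evidence (None = unfilled)

    def open_gaps(self):
        return [g for g, ev in self.gaps.items() if ev is None]

    def fill(self, gap, evidence):
        self.gaps[gap] = evidence

    def complete(self):
        return not self.open_gaps()


def answer_with_memory(question, memory_search, llm, max_steps=6):
    """Closed-loop retrieve-reflect-answer controller (illustrative sketch).

    memory_search(query) -> list of memory snippets (pluggable backend)
    llm(prompt) -> str   (stand-in for an LLM call)
    """
    tracker = EvidenceGapTracker()
    # Decompose the question into required evidence gaps (hypothetical prompt).
    for gap in llm(f"List evidence needed to answer: {question}").splitlines():
        if gap.strip():
            tracker.gaps[gap.strip()] = None

    evidence = []
    for _ in range(max_steps):
        # Router: choose the next action from the global evidence state.
        if tracker.complete():
            break  # all gaps filled -> answer
        if evidence and evidence[-1] == []:
            # Last retrieval came back empty -> reflect: rewrite the open gap
            # into a better search query (hypothetical prompt).
            gap = tracker.open_gaps()[0]
            new_gap = llm(f"Rewrite this unanswered sub-question: {gap}")
            tracker.gaps[new_gap] = tracker.gaps.pop(gap)
            evidence.pop()  # clear the empty result so the next step retrieves
        else:
            # Retrieve evidence for the first open gap.
            gap = tracker.open_gaps()[0]
            hits = memory_search(gap)
            evidence.append(hits)
            if hits:
                tracker.fill(gap, hits)

    context = "\n".join(s for hits in evidence for s in hits)
    return llm(f"Answer '{question}' using:\n{context}")
```

The tracker is what distinguishes this loop from plain retrieve-then-answer: the router consults a global, inspectable record of what is still missing, rather than answering after a single fixed retrieval.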