🤖 AI Summary
This work addresses the challenge that large language models (LLMs) struggle to continually learn without weight updates, as existing memory-based approaches are often vulnerable to noise and lack mechanisms for active optimization. To overcome this, the authors propose MemRL, a framework that decouples the stable reasoning capabilities of a frozen LLM from a plastic episodic memory module. MemRL enables runtime self-evolution through non-parametric reinforcement learning, featuring a two-stage retrieval mechanism—semantic filtering followed by Q-value-based selection—and leverages environmental feedback to update Q-values online. Experiments on HLE, BigCodeBench, ALFWorld, and Lifelong Agent Bench demonstrate that MemRL significantly outperforms current methods, establishing its effectiveness in achieving efficient continual learning without fine-tuning.
📝 Abstract
The hallmark of human intelligence is the self-evolving ability to master new skills by learning from past experiences. However, current AI agents struggle to emulate this self-evolution: fine-tuning is computationally expensive and prone to catastrophic forgetting, while existing memory-based methods rely on passive semantic matching that often retrieves noise. To address these challenges, we propose MemRL, a non-parametric approach that evolves via reinforcement learning on episodic memory. By decoupling stable reasoning from plastic memory, MemRL employs a Two-Phase Retrieval mechanism to filter noise and identify high-utility strategies through environmental feedback. Extensive experiments on HLE, BigCodeBench, ALFWorld, and Lifelong Agent Bench demonstrate that MemRL significantly outperforms state-of-the-art baselines, confirming that MemRL effectively resolves the stability-plasticity dilemma and enables continuous runtime improvement without weight updates. Code is available at https://github.com/MemTensor/MemRL.
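The two-stage retrieval described above (semantic filtering followed by Q-value-based selection, with Q-values updated online from environmental feedback) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class and method names, the incremental update rule, and the cosine-similarity filter are all assumptions for the sake of example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

class EpisodicMemory:
    """Hypothetical MemRL-style memory: frozen LLM reasoning is assumed
    to happen elsewhere; only this module is plastic."""

    def __init__(self, alpha=0.1):
        self.entries = []   # each entry: {"emb": ..., "strategy": ..., "q": ...}
        self.alpha = alpha  # step size for online Q-value updates

    def add(self, embedding, strategy, q_init=0.0):
        self.entries.append({"emb": embedding, "strategy": strategy, "q": q_init})

    def retrieve(self, query_emb, k_semantic=5, k_final=2):
        # Stage 1: semantic filtering, keep the k most similar entries.
        candidates = sorted(
            self.entries,
            key=lambda e: cosine(e["emb"], query_emb),
            reverse=True,
        )[:k_semantic]
        # Stage 2: among the survivors, select by learned utility (Q-value).
        return sorted(candidates, key=lambda e: e["q"], reverse=True)[:k_final]

    def update(self, entry, reward):
        # Online Q-value update from environmental feedback:
        # move the estimate toward the observed reward.
        entry["q"] += self.alpha * (reward - entry["q"])
```

A typical loop would embed the incoming task, call `retrieve` to condition the frozen LLM on high-utility strategies, then call `update` with the environment's reward, so that noisy or unhelpful memories sink in Q-value and stop being selected over time.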