Mem-$π$: Adaptive Memory through Learning When and What to Generate

📅 2026-05-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of existing memory-augmented agents: they rely on similarity-based retrieval that returns static entries, which often misalign with the current context. The authors propose Mem-$π$, a framework that generates guidance on demand instead of retrieving it. A dedicated language or vision-language model, with parameters separate from the downstream agent, is conditioned on the current agent context and jointly decides whether to produce guidance and what that guidance should be. It is trained with a decision-content decoupled reinforcement learning objective, so it can abstain when generation would not help and otherwise produce concise, useful guidance. Across agent benchmarks spanning web navigation, terminal-based tool use, and text-based embodied interaction, Mem-$π$ consistently outperforms retrieval-based and prior RL-optimized memory baselines, with over 30% relative improvement on web navigation tasks.

📝 Abstract

We present Mem-$π$, a framework for adaptive memory in large language model (LLM) agents, where useful guidance is generated on demand rather than retrieved from external memory stores. Existing memory-augmented agents typically rely on similarity-based retrieval from episodic memory banks or skill libraries, returning static entries that often misalign with the current context. In contrast, Mem-$π$ uses a dedicated language or vision-language model with its own parameters, separate from the downstream agent, to generate context-specific guidance for complex tasks. Conditioned on the current agent context, the model jointly decides when to produce guidance and what guidance to produce. We train it with a decision-content decoupled reinforcement learning (RL) objective, enabling it to abstain when generation would not help and otherwise produce concise, useful guidance. Across diverse agentic benchmarks spanning web navigation, terminal-based tool use, and text-based embodied interaction, Mem-$π$ consistently outperforms retrieval-based and prior RL-optimized memory baselines, achieving over 30% relative improvement on web navigation tasks.
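
The "when and what" interface described in the abstract can be illustrated with a minimal sketch. It covers only the inference-time gating: a memory model either abstains or emits guidance, and the agent prompt is augmented only in the second case. Everything named here is an assumption for illustration, not the paper's implementation: the `ABSTAIN` sentinel, the functions `generate_guidance` and `build_agent_prompt`, the prompt layout, and `toy_memory_model`, which is a hand-written rule standing in for the learned model. The RL training objective is not shown.

```python
from typing import Callable, Optional

# Hypothetical sentinel for "no guidance"; the abstract does not specify one.
ABSTAIN = "<no_guidance>"


def generate_guidance(memory_model: Callable[[str], str], context: str) -> Optional[str]:
    """Return guidance text, or None when the memory model abstains."""
    output = memory_model(context).strip()
    if not output or output == ABSTAIN:
        return None
    return output


def build_agent_prompt(context: str, guidance: Optional[str]) -> str:
    """Prepend guidance to the agent context only when some was generated."""
    if guidance is None:
        return context
    return f"Guidance: {guidance}\n\n{context}"


def toy_memory_model(context: str) -> str:
    """Stand-in for the learned model: a fixed rule, not a trained policy."""
    if "login" in context.lower():
        return "Fill in the username field before clicking Submit."
    return ABSTAIN


if __name__ == "__main__":
    for ctx in ["Page: Login form", "Page: search results"]:
        guidance = generate_guidance(toy_memory_model, ctx)
        print(build_agent_prompt(ctx, guidance))
        print("---")
```

Keeping the abstain case as an unchanged prompt means the downstream agent behaves exactly as it would without a memory module whenever no guidance is produced, which is one plausible way to realize the abstention behavior the abstract describes.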

Problem

Research questions and friction points this paper is trying to address.

adaptive memory

memory-augmented agents

context misalignment

static retrieval

LLM agents

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive memory

on-demand generation

decision-content decoupled RL