Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limitations of large language model agents in long-horizon reasoning. These limitations stem from constrained context windows and from existing approaches that treat short- and long-term memory as separate components, which prevents unified optimization and limits adaptability. To overcome these challenges, we propose the Agentic Memory (AgeMem) framework, which formulates memory management as learnable tool-based actions within the agent's policy, so that the agent autonomously stores, retrieves, updates, summarizes, and discards memories. We further introduce a three-stage progressive reinforcement learning curriculum combined with a step-wise GRPO algorithm to mitigate the sparse and discontinuous rewards induced by memory operations. Experiments show that AgeMem consistently outperforms strong memory-augmented baselines on five long-horizon benchmarks, with better task performance, higher-quality long-term memory, and more efficient context usage.
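For readers unfamiliar with GRPO, the sketch below shows the standard group-relative advantage (each rollout's reward normalized by the mean and standard deviation of its sampling group). The summary does not specify how the step-wise variant assigns credit, so the broadcast of one rollout-level advantage to every step in `stepwise_advantages` is an assumption made here for illustration, not the paper's algorithm. The function names, the `eps` constant, and the use of the population standard deviation are likewise hypothetical choices.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Standard GRPO-style advantage: normalize rewards within one sampled group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an illustrative choice
    return [(r - mu) / (sigma + eps) for r in rewards]


def stepwise_advantages(rewards, steps_per_rollout, eps=1e-6):
    """ASSUMPTION (not from the paper): copy each rollout's advantage to all of
    its steps, so intermediate memory actions receive a learning signal even
    though the reward only arrives at the end of the rollout."""
    advantages = group_relative_advantages(rewards, eps)
    return [[a] * n for a, n in zip(advantages, steps_per_rollout)]


# Four rollouts of one task with terminal rewards and differing step counts.
rewards = [1.0, 0.0, 0.0, 1.0]
per_step = stepwise_advantages(rewards, steps_per_rollout=[3, 2, 1, 2])
```

Under this assumption, every tool call in a successful rollout shares the same positive advantage, which is one simple way to pass a sparse terminal reward back to earlier memory operations.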

📝 Abstract
Large language model (LLM) agents face fundamental limitations in long-horizon reasoning due to finite context windows, making effective memory management critical. Existing methods typically handle long-term memory (LTM) and short-term memory (STM) as separate components, relying on heuristics or auxiliary controllers, which limits adaptability and end-to-end optimization. In this paper, we propose Agentic Memory (AgeMem), a unified framework that integrates LTM and STM management directly into the agent's policy. AgeMem exposes memory operations as tool-based actions, enabling the LLM agent to autonomously decide what and when to store, retrieve, update, summarize, or discard information. To train such unified behaviors, we propose a three-stage progressive reinforcement learning strategy and design a step-wise GRPO to address sparse and discontinuous rewards induced by memory operations. Experiments on five long-horizon benchmarks demonstrate that AgeMem consistently outperforms strong memory-augmented baselines across multiple LLM backbones, achieving improved task performance, higher-quality long-term memory, and more efficient context usage.
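To make the idea of memory operations as tool-based actions concrete, here is a minimal sketch of what such a tool surface could look like. Everything in it is hypothetical: the class `MemoryTools`, its method names, the action dictionary format, and the keyword-match retriever are illustrative assumptions, not AgeMem's actual interface. A dictionary stands in for long-term memory and a list for the short-term (in-context) memory.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryTools:
    """Hypothetical tool surface; names are illustrative, not AgeMem's API."""
    ltm: dict[str, str] = field(default_factory=dict)  # long-term store
    stm: list[str] = field(default_factory=list)       # in-context working memory

    def store(self, key: str, text: str) -> str:
        self.ltm[key] = text
        return f"stored {key}"

    def retrieve(self, query: str) -> list[str]:
        # Naive keyword match stands in for a real retriever.
        words = query.lower().split()
        hits = [t for t in self.ltm.values() if any(w in t.lower() for w in words)]
        self.stm.extend(hits)  # retrieved items enter the working context
        return hits

    def update(self, key: str, text: str) -> str:
        if key not in self.ltm:
            return f"missing {key}"
        self.ltm[key] = text
        return f"updated {key}"

    def summarize(self, max_chars: int = 80) -> str:
        # Truncation stands in for an LLM-written summary of the context.
        summary = " | ".join(self.stm)[:max_chars]
        self.stm = [summary]
        return summary

    def discard(self, key: str) -> str:
        removed = self.ltm.pop(key, None)
        return f"discarded {key}" if removed is not None else f"missing {key}"

    def call(self, action: dict) -> object:
        """Dispatch one policy-emitted action, e.g. {"tool": "store", "args": {...}}."""
        allowed = {"store", "retrieve", "update", "summarize", "discard"}
        name = action["tool"]
        if name not in allowed:
            raise ValueError(f"unknown memory tool: {name}")
        return getattr(self, name)(**action.get("args", {}))


# The policy would emit actions like these alongside its task actions.
mem = MemoryTools()
mem.call({"tool": "store", "args": {"key": "user_city", "text": "The user lives in Hangzhou."}})
mem.call({"tool": "retrieve", "args": {"query": "Hangzhou"}})
```

Routing every operation through a single `call` dispatcher reflects the design described in the abstract: the agent's policy, rather than a heuristic or auxiliary controller, chooses which memory operation to run and when.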
Problem

Research questions and friction points this paper is trying to address.

long-horizon reasoning
memory management
large language model agents
long-term memory
short-term memory
Innovation

Methods, ideas, or system contributions that make the work stand out.

Agentic Memory
unified memory management
tool-based memory actions
progressive reinforcement learning
long-horizon reasoning
Yi Yu
Alibaba Group
Liuyi Yao
Alibaba Group
Yuexiang Xie
Alibaba Group
NLP, AutoML, Federated Learning
Qingquan Tan
School of Cyber Science and Engineering, Wuhan University
Jiaqi Feng
School of Cyber Science and Engineering, Wuhan University
Yaliang Li
Alibaba Group
Machine Learning
Libing Wu
Wuhan University