From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory

📅 2025-11-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language model (LLM) agents suffer from poor strategic transfer and limited interpretability in open environments due to volatile implicit memory and inflexible explicit memory. Method: We propose the first trainable multi-layer graph memory framework, which models experience trajectories as structured decision graphs; extracts strategy-level metacognitive representations via graph neural networks; and integrates metacognitive prompting with context-aware memory retrieval. Memory weights are dynamically optimized through reinforcement learning to enable adaptive experience recall and reasoning enhancement. Contribution/Results: Experiments demonstrate substantial improvements in strategic reasoning capability and robust generalization across complex tasks, while preserving decision interpretability and enabling continual plasticity—addressing core limitations of existing memory-augmented LLM agents.

📝 Abstract
Large Language Model (LLM)-based agents have demonstrated remarkable potential in autonomous task-solving across complex, open-ended environments. A promising approach for improving the reasoning capabilities of LLM agents is to better utilize prior experiences in guiding current decisions. However, LLMs acquire experience either through implicit memory via training, which suffers from catastrophic forgetting and limited interpretability, or explicit memory via prompting, which lacks adaptability. In this paper, we introduce a novel agent-centric, trainable, multi-layered graph memory framework and evaluate how context memory enhances the ability of LLMs to utilize parametric information. The graph abstracts raw agent trajectories into structured decision paths in a state machine and further distills them into high-level, human-interpretable strategic meta-cognition. To make memory adaptable, we propose a reinforcement-based weight optimization procedure that estimates the empirical utility of each meta-cognition based on reward feedback from downstream tasks. These optimized strategies are then dynamically integrated into the LLM agent's training loop through meta-cognitive prompting. Empirically, the learnable graph memory delivers robust generalization, improves LLM agents' strategic reasoning performance, and provides consistent benefits during Reinforcement Learning (RL) training.
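The reinforcement-based weight optimization described in the abstract can be illustrated with a toy sketch. This is a hypothetical minimal version, not the paper's implementation: the graph/GNN machinery is omitted, strategy names are invented, and a simple advantage-style update stands in for whatever RL objective the authors actually use. Each strategy node carries a learnable utility weight; retrieval samples strategies softmax-proportionally to those weights, and downstream reward feedback nudges the weights of the strategies that were used.

```python
import math
import random

class GraphMemory:
    """Toy strategy memory with trainable utility weights (illustrative only;
    the paper's multi-layer graph and GNN encoder are not reproduced here)."""

    def __init__(self, strategies, lr=0.1):
        # Empirical utility estimate per strategy, all starting at zero.
        self.weights = {s: 0.0 for s in strategies}
        self.lr = lr

    def retrieve(self, k=2):
        # Softmax-sample k strategies in proportion to their current weights.
        names = list(self.weights)
        exps = [math.exp(self.weights[s]) for s in names]
        total = sum(exps)
        chosen = random.choices(names, weights=[e / total for e in exps], k=k)
        return list(dict.fromkeys(chosen))  # de-duplicate, keep order

    def update(self, used_strategies, reward, baseline=0.0):
        # Advantage-style update: credit the strategies that were in context.
        for s in used_strategies:
            self.weights[s] += self.lr * (reward - baseline)

# Hypothetical usage: retrieve strategies, run the task, feed back the reward.
mem = GraphMemory(["explore-first", "verify-subgoals", "backtrack-on-error"])
picked = mem.retrieve(k=2)
mem.update(picked, reward=1.0)  # the downstream task succeeded
```

Over many episodes, strategies that correlate with task success accumulate weight and are retrieved more often, which is the "adaptive experience recall" behavior the summary describes.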
Problem

Research questions and friction points this paper is trying to address.

Enhancing LLM agents' reasoning through trainable graph memory systems
Addressing catastrophic forgetting and limited interpretability in LLM experience utilization
Optimizing strategic meta-cognition integration via reinforcement-based memory adaptation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Trainable multi-layered graph memory framework
Reinforcement-based weight optimization for strategies
Meta-cognitive prompting integrates strategies into training
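The third innovation, meta-cognitive prompting, can be sketched as a simple prompt-assembly step. The template and strategy names below are assumptions for illustration; the paper's actual prompt format is not specified on this page. The idea is only that the top-weighted strategies from memory are prepended to the task prompt as explicit hints.

```python
def build_metacognitive_prompt(task, strategy_weights, top_k=2):
    """Prepend the highest-utility strategies as meta-cognitive hints
    (hypothetical template, not the paper's exact prompt)."""
    ranked = sorted(strategy_weights.items(), key=lambda kv: -kv[1])
    hints = "\n".join(f"- {name}" for name, _ in ranked[:top_k])
    return f"Relevant strategies from past experience:\n{hints}\n\nTask: {task}"

# Hypothetical usage with invented strategy names and weights.
prompt = build_metacognitive_prompt(
    "Navigate the maze to the goal",
    {"verify-subgoals": 0.8, "explore-first": 0.3, "backtrack-on-error": 0.5},
)
```

Because the weights are optimized by the reward-feedback procedure, the hints injected into the agent's context shift over training toward the strategies that have actually paid off.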
Siyu Xia
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Zekun Xu
Amazon
Machine Learning · Statistical Model
Jiajun Chai
Meituan Inc.
Reinforcement Learning · LLMs · Agentic Learning
Wentian Fan
Nanjing University of Posts and Telecommunications
Yan Song
AI Centre, Department of Computer Science, University College London, London, UK
Xiaohan Wang
Meituan
Guojun Yin
Meituan, University of Science and Technology of China
Multimodality · Computer Vision · Foundation Models · Deep Learning · Image/Video Processing
Wei Lin
Meituan
Haifeng Zhang
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Jun Wang
AI Centre, Department of Computer Science, University College London, London, UK