🤖 AI Summary
Existing LLM agents rely on unstructured memory—such as raw observations, summaries, or retrieved snippets—limiting their capacity for complex reasoning and multi-step planning.
Method: We propose the Memory Graph: a dynamically evolving knowledge graph serving as a world model that unifies semantic and episodic memory, constructed online and updated incrementally during interaction in text-based games. Our approach integrates episodic memory encoding, an LLM-augmented planner, graph neural network–assisted retrieval, and trajectory-driven graph updating.
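The memory graph described above can be sketched as a small data structure: semantic memory as a set of (subject, relation, object) triplets and episodic memory as timestamped observations linked to the triplets extracted from them, with incremental updates that overwrite stale facts. This is a minimal illustrative sketch, not the paper's implementation; all class and method names here are hypothetical, and the toy word-overlap retrieval stands in for the paper's learned retrieval.

```python
from dataclasses import dataclass

@dataclass
class Episode:
    """Episodic memory item: a raw observation plus the triplets extracted from it."""
    step: int
    observation: str
    triplets: list

class MemoryGraph:
    """Illustrative sketch of a dynamic memory graph (names are hypothetical)."""

    def __init__(self):
        self.semantic = set()   # semantic memory: {(subject, relation, object), ...}
        self.episodes = []      # episodic memory, in interaction order

    def update(self, step, observation, new_triplets):
        """Incrementally merge new triplets; a new (subject, relation) pair
        replaces the outdated object (e.g. the agent changed location)."""
        for s, r, o in new_triplets:
            self.semantic = {t for t in self.semantic if (t[0], t[1]) != (s, r)}
            self.semantic.add((s, r, o))
        self.episodes.append(Episode(step, observation, list(new_triplets)))

    def retrieve(self, query, k=3):
        """Toy relevance scoring: rank triplets by word overlap with the query."""
        words = set(query.lower().split())
        scored = sorted(
            self.semantic,
            key=lambda t: -len(words & set(" ".join(t).lower().split())),
        )
        return scored[:k]
```

For example, after `update(1, "You go to the garden.", [("agent", "is_in", "garden")])`, an earlier `("agent", "is_in", "kitchen")` triplet is replaced rather than duplicated, which is the sense in which the graph evolves online during interaction.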
Contribution/Results: Experiments demonstrate substantial improvements over baselines—including history-based recall, summary memory, and reinforcement learning—on challenging text games. On multi-hop question answering, AriGraph matches the performance of dedicated knowledge graph methods. This work introduces the first dynamic memory graph architecture for LLM agents, establishing an interpretable, evolvable, structured memory paradigm for embodied reasoning and continual learning.
📝 Abstract
Advancements in the capabilities of Large Language Models (LLMs) have created a promising foundation for developing autonomous agents. With the right tools, these agents could learn to solve tasks in new environments by accumulating and updating their knowledge. Current LLM-based agents process past experiences using a full history of observations, summarization, or retrieval augmentation. However, these unstructured memory representations do not facilitate the reasoning and planning essential for complex decision-making. In our study, we introduce AriGraph, a novel method wherein the agent constructs and updates a memory graph that integrates semantic and episodic memories while exploring the environment. We demonstrate that our Ariadne LLM agent, consisting of the proposed memory architecture augmented with planning and decision-making, effectively handles complex tasks within interactive text game environments that are difficult even for human players. Results show that our approach markedly outperforms other established memory methods and strong RL baselines on a range of problems of varying complexity. Additionally, AriGraph demonstrates competitive performance compared to dedicated knowledge graph-based methods in static multi-hop question answering.