🤖 AI Summary
This work addresses a limitation of large language model agents in long-horizon tasks: flat memory storage and shallow semantic retrieval hinder complex reasoning. The authors propose CompassMem, a framework that organizes memory as an event graph whose events are linked by explicit logical relations. Inspired by Event Segmentation Theory, CompassMem builds a navigable logic map that couples memory organization with the reasoning process. This enables goal-directed memory retrieval and structured reasoning, and it consistently improves both retrieval and reasoning performance across multiple backbone models on the LoCoMo and NarrativeQA benchmarks.
📝 Abstract
Large language models (LLMs) are increasingly deployed as intelligent agents that reason, plan, and interact with their environments. To scale effectively to long-horizon scenarios, such agents need a memory mechanism that can retain, organize, and retrieve past experiences to support downstream decision-making. However, most existing approaches store memories in a flat manner and rely on simple similarity-based retrieval. Even when structured memory is introduced, existing methods often struggle to explicitly capture the logical relationships among experiences or memory units. Moreover, memory access is largely detached from the constructed structure and still depends on shallow semantic retrieval, preventing agents from reasoning logically over long-horizon dependencies. In this work, we propose CompassMem, an event-centric memory framework inspired by Event Segmentation Theory. CompassMem organizes memory as an Event Graph by incrementally segmenting experiences into events and linking them through explicit logical relations. This graph serves as a logic map: agents navigate memory in a structured, goal-directed way rather than relying on superficial retrieval, progressively gathering relevant memories to support long-horizon reasoning. Experiments on LoCoMo and NarrativeQA demonstrate that CompassMem consistently improves both retrieval and reasoning performance across multiple backbone models.
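
To make the idea of an Event Graph used as a logic map more concrete, here is a minimal Python sketch. It is not the CompassMem implementation: the names `EventGraph`, `add_event`, `link`, `navigate`, and `overlap`, the relation label `"causes"`, the choice to store each edge in both directions, and the word-overlap relevance score are all assumptions made for illustration. Events are added by hand here, whereas the abstract describes segmenting experiences into events incrementally.

```python
import heapq
from dataclasses import dataclass, field


def overlap(query, text):
    """Toy relevance score (assumption): number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


@dataclass
class EventGraph:
    # event_id -> event text
    events: dict = field(default_factory=dict)
    # event_id -> list of (relation label, neighbor event_id)
    edges: dict = field(default_factory=dict)

    def add_event(self, text):
        """Append one event node and return its id."""
        event_id = len(self.events)
        self.events[event_id] = text
        self.edges[event_id] = []
        return event_id

    def link(self, src, relation, dst):
        """Record a labeled relation; the reverse edge lets navigation walk back."""
        self.edges[src].append((relation, dst))
        self.edges[dst].append((relation + "^-1", src))

    def navigate(self, query, budget=3):
        """Best-first walk: seed at the best-matching event, then follow edges."""
        if not self.events:
            return []
        seed = max(self.events, key=lambda e: overlap(query, self.events[e]))
        frontier = [(-overlap(query, self.events[seed]), seed)]
        seen = {seed}
        visited = []
        while frontier and len(visited) < budget:
            _, current = heapq.heappop(frontier)
            visited.append(current)
            for _relation, neighbor in self.edges[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    score = overlap(query, self.events[neighbor])
                    heapq.heappush(frontier, (-score, neighbor))
        return visited


graph = EventGraph()
missed = graph.add_event("Alice missed her flight to Paris")
rebooked = graph.add_event("Alice rebooked for the next morning")
unrelated = graph.add_event("Bob adopted a dog")
graph.link(missed, "causes", rebooked)

path = graph.navigate("why did Alice rebook her flight", budget=2)
print([graph.events[e] for e in path])
```

In this toy run the walk starts at the event that best matches the query and reaches the second event only through the explicit `"causes"` edge, while the unlinked event is never visited. A flat similarity search would instead rank all three events independently of one another, which is the contrast the abstract draws.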