🤖 AI Summary
This work addresses a limitation of existing agent memory systems: they focus predominantly on semantic memory and struggle to model coherent events involving characters, time, and location. Inspired by dramatic theory, the authors propose a character-and-scene-driven episodic memory representation built around character profiles and 3D scenes (time, place, and topic). This episodic memory is combined with a graph-structured semantic memory into a dual-memory architecture, so that episodic and semantic retrieval complement each other. To the authors' knowledge, this is the first effort to incorporate dramatic theory into agent memory modeling. Empirical evaluations show consistent improvements across multiple datasets, with an average gain of 8.11% in F1 score and 10.21% in LLM-as-a-Judge ratings, and the largest benefits on open-ended and time-sensitive dialogue tasks.
📝 Abstract
Episodic memory, the ability to recall coherent events grounded in who, when, and where, is a central component of human memory. However, most agent memory systems emphasize only semantic recall and store experience in structures such as key-value pairs, vectors, or graphs, which makes it hard for them to represent and retrieve coherent events. To address this challenge, we propose a Character-and-Scene based memory architecture (CAST) inspired by dramatic theory. Specifically, CAST constructs 3D scenes (time/place/topic) and organizes them into character profiles, each of which summarizes the events involving one character, to represent episodic memory. Moreover, CAST complements this episodic memory with a graph-based semantic memory, yielding a robust dual-memory design. Experiments demonstrate that CAST improves over baselines by an average of 8.11% in F1 and 10.21% in J (LLM-as-a-Judge) on various datasets, especially on open and time-sensitive conversational questions.
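To make the dual-memory idea concrete, the following is a minimal sketch of how scenes, character profiles, and a graph-style semantic store might fit together. It is an illustration under stated assumptions, not the CAST implementation: the names `Scene`, `CharacterProfile`, `DualMemory`, `recall`, and `retrieve` are hypothetical, times are assumed to be ISO date strings, and exact matching on each axis stands in for whatever retrieval the paper actually uses.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Scene:
    """One episodic unit located on three axes: time, place, topic."""
    time: str      # assumed ISO date string, e.g. "2023-05-07"
    place: str
    topic: str
    summary: str


@dataclass
class CharacterProfile:
    """Groups the scenes that one character took part in (hypothetical)."""
    name: str
    scenes: list = field(default_factory=list)

    def recall(self, time_prefix=None, place=None, topic=None):
        """Return scenes matching every axis that was specified."""
        return [
            s for s in self.scenes
            if (time_prefix is None or s.time.startswith(time_prefix))
            and (place is None or s.place == place)
            and (topic is None or s.topic == topic)
        ]


class DualMemory:
    """Episodic store (profiles of scenes) plus a tiny semantic graph."""

    def __init__(self):
        self.profiles = {}             # name -> CharacterProfile
        self.graph = defaultdict(set)  # subject -> {(relation, object)}

    def add_scene(self, character, scene):
        if character not in self.profiles:
            self.profiles[character] = CharacterProfile(character)
        self.profiles[character].scenes.append(scene)

    def add_fact(self, subject, relation, obj):
        self.graph[subject].add((relation, obj))

    def retrieve(self, character, **axes):
        """Combine episodic scenes and semantic facts about one character."""
        profile = self.profiles.get(character)
        episodic = profile.recall(**axes) if profile else []
        semantic = sorted(self.graph.get(character, set()))
        return {"episodic": episodic, "semantic": semantic}


# Usage: a time-sensitive question about Alice in May 2023.
memory = DualMemory()
memory.add_scene(
    "Alice",
    Scene("2023-05-07", "Paris", "travel", "Alice visited the Louvre."),
)
memory.add_fact("Alice", "likes", "painting")
result = memory.retrieve("Alice", time_prefix="2023-05")
```

In this sketch, the episodic side answers "who, when, where" questions by filtering a character's scenes along the time, place, and topic axes, while the semantic side returns stable facts that are not tied to a single event. A real system would likely replace the exact-match filters with learned or embedding-based retrieval.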