A Simple Yet Strong Baseline for Long-Term Conversational Memory of LLM Agents

📅 2025-11-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of maintaining long-term coherence and personalization in multi-turn conversational agents powered by large language models (LLMs), this paper proposes an event-centric memory architecture. Inspired by neo-Davidsonian event semantics, it models dialogue history as structured propositional units, each encoding participants, temporal cues, and local context, yielding a non-compressive, fine-grained, and semantically complete memory representation. It further extracts enhanced Elementary Discourse Units (EDUs) with entity normalization and provenance annotation, and organizes sessions, EDUs, and their arguments into a heterogeneous graph that supports joint retrieval via dense indexing, LLM-based filtering, and graph propagation. Evaluated on LoCoMo and LongMemEval$_S$, the method matches or surpasses state-of-the-art performance while significantly reducing the context length required for question answering, demonstrating the efficacy and practicality of event-level memory in long-horizon dialogue.

📝 Abstract
LLM-based conversational agents still struggle to maintain coherent, personalized interaction over many sessions: fixed context windows limit how much history can be kept in view, and most external memory approaches trade off between coarse retrieval over large chunks and fine-grained but fragmented views of the dialogue. Motivated by neo-Davidsonian event semantics, we propose an event-centric alternative that represents conversational history as short, event-like propositions which bundle together participants, temporal cues, and minimal local context, rather than as independent relation triples or opaque summaries. In contrast to work that aggressively compresses or forgets past content, our design aims to preserve information in a non-compressive form and make it more accessible, rather than more lossy. Concretely, we instruct an LLM to decompose each session into enriched elementary discourse units (EDUs) -- self-contained statements with normalized entities and source turn attributions -- and organize sessions, EDUs, and their arguments in a heterogeneous graph that supports associative recall. On top of this representation we build two simple retrieval-based variants that use dense similarity search and LLM filtering, with an optional graph-based propagation step to connect and aggregate evidence across related EDUs. Experiments on the LoCoMo and LongMemEval$_S$ benchmarks show that these event-centric memories match or surpass strong baselines, while operating with much shorter QA contexts. Our results suggest that structurally simple, event-level memory provides a principled and practical foundation for long-horizon conversational agents. Our code and data will be released at https://github.com/KevinSRR/EMem.
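The heterogeneous graph over sessions, EDUs, and their arguments, together with its associative-recall propagation step, can be sketched with plain adjacency sets. This is a toy illustration under assumed node naming, not the authors' implementation:

```python
from collections import defaultdict

edges = defaultdict(set)  # node -> set of neighboring nodes

def add_edu(session_id, edu_id, args):
    """Link a session to an EDU, and the EDU to its normalized arguments."""
    edges[("session", session_id)].add(("edu", edu_id))
    for a in args:
        edges[("edu", edu_id)].add(("arg", a))
        edges[("arg", a)].add(("edu", edu_id))  # back-edge enables associative recall

def related_edus(edu_id):
    """One-hop propagation: other EDUs that share an argument with this one."""
    related = set()
    for node in edges[("edu", edu_id)]:
        if node[0] == "arg":
            related |= {n[1] for n in edges[node] if n[0] == "edu" and n[1] != edu_id}
    return related

add_edu("s1", "e1", ["Alice", "Max"])
add_edu("s2", "e2", ["Alice", "vet clinic"])
```

Here `related_edus("e1")` recovers `e2` because both mention the normalized entity "Alice", which is the kind of cross-session evidence aggregation the abstract's optional graph propagation step describes.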
Problem

Research questions and friction points this paper is trying to address.

LLM agents struggle with long-term conversational memory across multiple sessions
Fixed context windows limit history retention in conversational agents
Existing memory approaches trade off between coarse retrieval and fragmented dialogue views
Innovation

Methods, ideas, or system contributions that make the work stand out.

Event-centric memory using short propositions
Heterogeneous graph organizing sessions and EDUs
Retrieval combining dense similarity search, LLM-based filtering, and optional graph propagation
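The two-stage retrieval listed above, dense similarity search followed by LLM filtering, might be sketched as follows. The bag-of-words embedding and the `llm_keep` stub are stand-ins for a real dense encoder and a real LLM relevance call, so this is a shape sketch only:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a real system would use a dense sentence encoder."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, edus, k=2, llm_keep=lambda q, e: True):
    """Stage 1: rank EDUs by similarity and keep the top k.
    Stage 2: pass survivors through an LLM relevance filter (stubbed here)."""
    q = embed(query)
    ranked = sorted(edus, key=lambda e: cosine(q, embed(e)), reverse=True)
    return [e for e in ranked[:k] if llm_keep(query, e)]

memory = [
    "Alice adopted a puppy named Max.",
    "Bob started a new job in Berlin.",
    "Alice took Max to the vet on Friday.",
]
hits = retrieve("Who adopted a puppy?", memory, k=2)
```

Because each EDU is short and self-contained, the filtered hits can be concatenated directly into the QA prompt, which is why the method operates with much shorter contexts than chunk-level retrieval.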