ES-Mem: Event Segmentation-Based Memory for Long-Term Dialogue Agents

📅 2026-01-12
🤖 AI Summary
Existing dialogue memory mechanisms often suffer from semantic fragmentation due to rigid memory granularity and struggle with precise contextual localization because flat retrieval fails to leverage discourse structural cues. Inspired by event segmentation theory, this work introduces dynamic event boundary detection into dialogue memory systems for the first time, proposing a hierarchical memory architecture. The approach employs a dynamic event segmentation module to partition long conversations into semantically coherent event units, enabling memory anchoring and hierarchical retrieval grounded in event-boundary semantics. This design effectively balances semantic integrity with fine-grained contextual localization. Empirical results demonstrate consistent performance gains over strong baselines on two memory benchmarks, while the event segmentation module exhibits strong generalization capabilities on dialogue segmentation datasets.
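To make the idea of event boundary detection concrete, here is a minimal sketch under stated assumptions. It is not the paper's segmentation module: it swaps in a simple bag-of-words cosine similarity and a fixed threshold, and the names `segment_events`, `bag_of_words`, and `threshold` are hypothetical. A new event is opened whenever an incoming turn is too dissimilar from the event accumulated so far.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased token counts; a stand-in for a real sentence encoder."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment_events(turns, threshold=0.1):
    """Group consecutive turns into events.

    A boundary is placed before a turn whose similarity to the running
    event falls below `threshold` (an assumed, hand-picked value).
    """
    events = []
    current = []
    current_bow = Counter()
    for turn in turns:
        bow = bag_of_words(turn)
        if current and cosine(current_bow, bow) < threshold:
            events.append(current)  # close the running event at the boundary
            current, current_bow = [], Counter()
        current.append(turn)
        current_bow.update(bow)
    if current:
        events.append(current)
    return events


turns = [
    "I adopted a puppy last week",
    "What breed is the puppy",
    "The puppy is a beagle",
    "My flight to Tokyo leaves Monday",
    "Which airline is the flight with",
]
events = segment_events(turns)
```

With these toy turns, the puppy discussion and the flight discussion land in separate events because the fourth turn shares no vocabulary with the first three. A learned encoder and an adaptive boundary criterion would replace the lexical overlap used here.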

📝 Abstract
Memory is critical for dialogue agents to maintain coherence and enable continuous adaptation in long-term interactions. While existing memory mechanisms offer basic storage and retrieval capabilities, they are hindered by two primary limitations: (1) rigid memory granularity often disrupts semantic integrity, resulting in fragmented and incoherent memory units; (2) prevalent flat retrieval paradigms rely solely on surface-level semantic similarity, neglecting the structural cues of discourse required to navigate and locate specific episodic contexts. To mitigate these limitations, drawing inspiration from Event Segmentation Theory, we propose ES-Mem, a framework incorporating two core components: (1) a dynamic event segmentation module that partitions long-term interactions into semantically coherent events with distinct boundaries; (2) a hierarchical memory architecture that constructs multi-layered memories and leverages boundary semantics to anchor specific episodic memory for precise context localization. Evaluations on two memory benchmarks demonstrate that ES-Mem yields consistent performance gains over baseline methods. Furthermore, the proposed event segmentation module exhibits robust applicability on dialogue segmentation datasets.
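The contrast between flat and hierarchical retrieval can be illustrated with a small two-layer sketch. This is an assumption-laden illustration, not the ES-Mem architecture: `EventMemory`, `retrieve`, and the bag-of-words scoring are hypothetical, and the paper's use of boundary semantics for anchoring is not modeled. The query is first matched against event-level indexes, and only then against the turns inside the selected event.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def bag_of_words(text):
    """Lowercased token counts; a stand-in for a real sentence encoder."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class EventMemory:
    """One event: its raw turns plus an event-level index built from them."""
    turns: list
    index: Counter = field(init=False)

    def __post_init__(self):
        self.index = bag_of_words(" ".join(self.turns))


def retrieve(memories, query):
    """Coarse-to-fine lookup: pick the best event, then the best turn in it."""
    q = bag_of_words(query)
    event = max(memories, key=lambda m: cosine(m.index, q))
    turn = max(event.turns, key=lambda t: cosine(bag_of_words(t), q))
    return event, turn


memories = [
    EventMemory(["I adopted a puppy last week", "The puppy is a beagle"]),
    EventMemory(["My flight to Tokyo leaves Monday", "The airline is ANA"]),
]
event, turn = retrieve(memories, "When does the flight to Tokyo leave")
```

Restricting the fine-grained search to one event keeps a retrieved turn attached to its surrounding context, which is the property the abstract attributes to hierarchical retrieval over flat similarity search.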
Problem

Research questions and friction points this paper is trying to address.

memory granularity
semantic integrity
flat retrieval
discourse structure
episodic context
Innovation

Methods, ideas, or system contributions that make the work stand out.

Event Segmentation
Hierarchical Memory
Dialogue Agents
Memory Architecture
Context Localization
Authors
Huhai Zou
College of Computer Science, Chongqing University
Tianhao Sun
College of Computer Science, Chongqing University
Chuanjiang He
College of Computer Science, Chongqing University
Yu Tian
Tsinghua University
Zhenyang Li
Tsinghua University & The University of Hong Kong
AI, Computer Vision & Graphics, Data mining
Li Jin
Associate Professor of the Aerospace Information Research Institute, Chinese Academy of Sciences
Natural Language Processing, Spatial-temporal Knowledge Graph
Nayu Liu
School of Computer Science and Technology, Tiangong University
Jiang Zhong
College of Computer Science, Chongqing University
Kaiwen Wei
College of Computer Science, Chongqing University