AI Summary
Existing studies offer conflicting conclusions regarding the effectiveness of graph-based structures in dialogue memory systems, making it difficult to isolate key design factors. This work proposes the first unified and modular framework for analyzing dialogue memory mechanisms, accommodating both graph-based and non-graph approaches. Through staged controlled experiments on LongMemEval and HaluMem, the study systematically evaluates the impact of core components: memory representation, organization, maintenance, and retrieval. Findings reveal that performance differences primarily stem from underlying system configurations rather than specific architectural innovations. Moreover, the analysis identifies several robust and consistently strong baselines, establishing reproducible benchmarks and actionable design guidelines for future research in dialogue memory systems.
Abstract
Graph structures are increasingly used in dialog memory systems, but empirical findings on their effectiveness remain inconsistent, making it unclear which design choices truly matter. We present an experimental, system-oriented analysis of long-term dialog memory architectures. We introduce a unified framework that decomposes dialog memory systems into core components and supports both graph-based and non-graph approaches. Under this framework, we conduct controlled, stage-wise experiments on LongMemEval and HaluMem, comparing common design choices in memory representation, organization, maintenance, and retrieval. Our results show that many performance differences are driven by foundational system settings rather than specific architectural innovations. Based on these findings, we identify stable and reliable strong baselines for future dialog memory research. Code is available at https://github.com/AvatarMemory/UnifiedMem
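To make the four-component decomposition concrete, here is a minimal Python sketch of such a modular memory system. All class and method names are illustrative assumptions, not taken from the UnifiedMem codebase; the retrieval step uses naive keyword overlap purely as a placeholder for a real retriever.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    text: str                                   # representation: one stored memory unit
    links: list = field(default_factory=list)   # organization: graph edges (empty => non-graph)

class DialogMemory:
    """Hypothetical decomposition into representation, organization,
    maintenance, and retrieval; each stage can be swapped independently."""

    def __init__(self):
        self.items: list[MemoryItem] = []

    def write(self, utterance: str) -> MemoryItem:
        # representation + organization: store a new memory unit
        item = MemoryItem(text=utterance)
        self.items.append(item)
        return item

    def link(self, a: MemoryItem, b: MemoryItem) -> None:
        # optional graph-based organization: connect related memories
        a.links.append(b)

    def maintain(self, max_items: int) -> None:
        # maintenance: prune oldest entries beyond a budget
        self.items = self.items[-max_items:]

    def retrieve(self, query: str, k: int = 3) -> list[MemoryItem]:
        # retrieval: rank by keyword overlap (stand-in for dense retrieval)
        q = set(query.lower().split())
        return sorted(self.items,
                      key=lambda m: len(q & set(m.text.lower().split())),
                      reverse=True)[:k]

mem = DialogMemory()
a = mem.write("User moved to Berlin in 2021")
b = mem.write("User works as a data engineer")
mem.link(a, b)                      # graph variant: add an edge; non-graph variant skips this
hits = mem.retrieve("user moved city")
print(hits[0].text)                 # -> User moved to Berlin in 2021
```

Framed this way, a graph-based system and a non-graph system differ only in whether `link` is used and consulted at retrieval time, which is what allows the components to be ablated independently.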