AI Summary
This work addresses the limitations of existing vector-similarity-based long-term dialogue memory retrieval methods, which are prone to interference from irrelevant context, leading to redundant evidence, low precision, and high computational overhead. To overcome these issues, the authors propose a two-tier event-turn memory system that leverages a large language model (LLM) to first generate event-level summaries as semantic anchors and then employs a reasoning mechanism to precisely select relevant dialogue turns. Evaluated on the LoCoMo10 benchmark, the approach significantly outperforms current state-of-the-art methods, achieving the highest F1 scores in four out of five question categories and improving adversarial F1 from 0.54 to 0.78, while reducing the number of retrieved turns by an order of magnitude.
Abstract
Long-term conversational large language model (LLM) agents require memory systems that can recover relevant evidence from historical interactions without overwhelming the answer stage with irrelevant context. However, existing memory systems, including hierarchical ones, still often rely solely on vector similarity for retrieval. This reliance tends to produce bloated evidence sets: adding many superficially similar dialogue turns yields little additional recall, but it lowers retrieval precision, increases answer-stage context cost, and makes retrieved memories harder to inspect and manage. To address this, we propose HiGMem (Hierarchical and LLM-Guided Memory System), a two-level event-turn memory system in which the LLM uses event summaries as semantic anchors to predict which related turns are worth reading. The model inspects high-level event summaries first and then focuses on a smaller set of potentially useful turns, so it assembles a concise and reliable evidence set through reasoning while keeping retrieval overhead from growing excessively relative to vector retrieval. On the LoCoMo10 benchmark, HiGMem achieves the best F1 on four of five question categories and improves adversarial F1 from 0.54 to 0.78 over A-Mem, while retrieving an order of magnitude fewer turns. Code is publicly available at https://github.com/ZeroLoss-Lab/HiGMem.
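The two-level idea described above can be illustrated with a minimal sketch. This is not the authors' implementation and does not reflect the HiGMem repository's API: the names `Turn`, `Event`, `Judge`, `keyword_judge`, and `retrieve` are hypothetical, and a toy keyword-overlap function stands in for the LLM's relevance judgment so that the example runs without any model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    turn_id: int
    speaker: str
    text: str


@dataclass
class Event:
    event_id: int
    summary: str  # event-level summary acting as the semantic anchor
    turns: List[Turn] = field(default_factory=list)


# A judge answers "is this text worth reading for this query?".
# In the system described above an LLM plays this role; here it is
# just a callable, which is an assumption made for illustration.
Judge = Callable[[str, str], bool]

STOP_WORDS = {"the", "a", "an", "did", "what", "when", "is", "to", "of", "in"}


def _words(text: str) -> set:
    """Lowercase the text, strip basic punctuation, drop stop words."""
    return {w.strip("?.,!").lower() for w in text.split()} - STOP_WORDS


def keyword_judge(query: str, text: str) -> bool:
    """Toy stand-in for the LLM: true if query and text share a content word."""
    return bool(_words(query) & _words(text))


def retrieve(query: str, events: List[Event], judge: Judge) -> List[Turn]:
    """Level 1: keep events whose summary looks relevant.
    Level 2: within those events, keep only the turns worth reading."""
    selected: List[Turn] = []
    for event in events:
        if not judge(query, event.summary):
            continue  # turns of unrelated events are never inspected
        selected.extend(t for t in event.turns if judge(query, t.text))
    return selected


events = [
    Event(0, "Alice adopts a dog named Biscuit", [
        Turn(0, "Alice", "I adopted a dog last week!"),
        Turn(1, "Bob", "Congrats, that is wonderful news!"),
        Turn(2, "Alice", "The dog is called Biscuit."),
    ]),
    Event(1, "Bob plans a trip to Lisbon", [
        Turn(3, "Bob", "I booked flights to Lisbon."),
        Turn(4, "Alice", "My dog would love the beach there."),
    ]),
]

query = "What is the name of Alice's dog?"
result = retrieve(query, events, keyword_judge)
print([t.turn_id for t in result])  # [0, 2]
```

In this toy memory, turn 4 mentions a dog and would look similar to the query at the turn level, but it is never read because its event summary is judged unrelated. That is the kind of superficially similar evidence the event-level anchor is meant to filter out. Swapping `keyword_judge` for a real LLM call would change the quality of the judgments but not the control flow of the sketch.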