🤖 AI Summary
This work addresses the challenge that existing large language model (LLM) agent memory systems struggle to maintain thematic continuity in dialogue, often producing fragmented narratives and broken causal chains. To overcome this, the authors propose Membox, a novel hierarchical memory architecture that introduces thematic continuity modeling at the storage stage. Membox employs a Topic Loom to aggregate thematically related dialogue segments into coherent "memory boxes" and a Trace Weaver to construct long-range event timelines across conversational discontinuities, enabling cognitively inspired, efficient memory organization. Departing from the conventional "fragmented storage, retrieval reconstruction" paradigm, Membox achieves up to a 68% relative improvement in temporal reasoning F1 score on the LoCoMo benchmark, significantly outperforming baselines such as Mem0 and A-MEM while using fewer context tokens, thereby delivering both higher efficiency and superior performance.
📝 Abstract
Human-agent dialogues often exhibit topic continuity (a stable thematic frame that evolves through temporally adjacent exchanges), yet most large language model (LLM) agent memory systems fail to preserve it. Existing designs follow a fragmentation-compensation paradigm: they first break dialogue streams into isolated utterances for storage, then attempt to restore coherence via embedding-based retrieval. This process irreversibly damages narrative and causal flow while biasing retrieval toward lexical similarity. We introduce Membox, a hierarchical memory architecture centered on a Topic Loom that continuously monitors dialogue in a sliding-window fashion, grouping consecutive same-topic turns into coherent "memory boxes" at storage time. Sealed boxes are then linked by a Trace Weaver into long-range event-timeline traces, recovering macro-topic recurrences across discontinuities. Experiments on LoCoMo demonstrate that Membox achieves up to a 68% F1 improvement on temporal reasoning tasks, outperforming competitive baselines (e.g., Mem0, A-MEM). Notably, Membox attains these gains while using only a fraction of the context tokens required by existing methods, striking a superior balance between efficiency and effectiveness. By explicitly modeling topic continuity, Membox offers a cognitively motivated mechanism for enhancing both coherence and efficiency in LLM agents.
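To make the two-stage storage idea concrete, here is a minimal sketch of how consecutive same-topic turns could be grouped into "memory boxes" and how sealed boxes could be chained into long-range traces. All names (`MemoryBox`, `store_dialogue`, `weave_traces`) and the keyword-overlap similarity heuristic are illustrative assumptions on my part; the paper presumably uses an LLM or embedding-based topic judge, not word overlap.

```python
# Hypothetical sketch of Membox-style storage, NOT the authors' implementation.
# Topic similarity is approximated with Jaccard word overlap for illustration.
import re
from dataclasses import dataclass, field

def tokens(text: str) -> set:
    """Lowercased word set, punctuation stripped."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a: set, b: set) -> float:
    return len(a & b) / max(1, len(a | b))

@dataclass
class MemoryBox:
    turns: list = field(default_factory=list)

    def keywords(self) -> set:
        return tokens(" ".join(self.turns))

def store_dialogue(turns, same_topic=0.2):
    """'Topic Loom' step: group consecutive same-topic turns into boxes at storage time."""
    boxes = []
    for turn in turns:
        if boxes and jaccard(boxes[-1].keywords(), tokens(turn)) >= same_topic:
            boxes[-1].turns.append(turn)       # turn continues the open box's topic
        else:
            boxes.append(MemoryBox([turn]))    # topic shift: seal old box, open a new one
    return boxes

def weave_traces(boxes, link=0.2):
    """'Trace Weaver' step: chain sealed boxes whose macro-topic recurs after a gap."""
    traces = []
    for i, box in enumerate(boxes):
        for trace in traces:
            if jaccard(box.keywords(), boxes[trace[-1]].keywords()) >= link:
                trace.append(i)                # macro-topic recurrence across a discontinuity
                break
        else:
            traces.append([i])                 # start a new event-timeline trace
    return traces

turns = [
    "Planning the trip to Kyoto next month.",
    "The Kyoto trip hotel is booked near the station.",
    "My dog needs a vet appointment.",
    "Back to the Kyoto trip: found a temple tour.",
]
boxes = store_dialogue(turns)    # three boxes: Kyoto, dog, Kyoto again
traces = weave_traces(boxes)     # the two Kyoto boxes join one trace: [[0, 2], [1]]
```

The point of the sketch is the paradigm shift the abstract describes: coherence is captured once at write time (boxes and traces), rather than reconstructed at read time from isolated utterances via similarity search.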