Pre-Storage Reasoning for Episodic Memory: Shifting Inference Burden to Memory for Personalized Dialogue

📅 2025-09-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address weak long-term memory, heavy reasoning overhead at response time, and excessive reliance on model scale in conversational AI, this paper proposes PREMem, a framework that shifts complex reasoning to the memory construction phase to enable cross-session information integration and personalized response generation. Its core contributions are (1) categorizing memory fragments as factual, experiential, or subjective; (2) explicitly modeling how memories evolve across sessions through extensions, transformations, and implications; and (3) performing fine-grained extraction, classification, and relational modeling during pre-storage to produce structured, semantically enriched memory. Experiments show significant improvements across all model sizes: smaller models achieve results comparable to much larger baselines and remain effective under constrained token budgets.

📝 Abstract

Effective long-term memory in conversational AI requires synthesizing information across multiple sessions. However, current systems place excessive reasoning burden on response generation, making performance significantly dependent on model sizes. We introduce PREMem (Pre-storage Reasoning for Episodic Memory), a novel approach that shifts complex reasoning processes from inference to memory construction. PREMem extracts fine-grained memory fragments categorized into factual, experiential, and subjective information; it then establishes explicit relationships between memory items across sessions, capturing evolution patterns like extensions, transformations, and implications. By performing this reasoning during pre-storage rather than when generating a response, PREMem creates enriched representations while reducing computational demands during interactions. Experiments show significant performance improvements across all model sizes, with smaller models achieving results comparable to much larger baselines while maintaining effectiveness even with constrained token budgets. Code and dataset are available at https://github.com/sangyeop-kim/PREMem.
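The memory structure described in the abstract can be pictured with a small data model. The following is a minimal sketch only, not the authors' implementation: the class names (`Category`, `Relation`, `MemoryFragment`, `MemoryStore`), the field names, and the rule that a link must point from an earlier session to a later one are all assumptions made for illustration. Only the three fragment categories and the three relation types come from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum


class Category(Enum):
    # The three fragment categories named in the abstract.
    FACTUAL = "factual"
    EXPERIENTIAL = "experiential"
    SUBJECTIVE = "subjective"


class Relation(Enum):
    # The three evolution patterns named in the abstract.
    EXTENSION = "extension"
    TRANSFORMATION = "transformation"
    IMPLICATION = "implication"


@dataclass
class MemoryFragment:
    session_id: int  # hypothetical field: which session produced the fragment
    category: Category
    text: str


@dataclass
class MemoryStore:
    fragments: list[MemoryFragment] = field(default_factory=list)
    # Each link is (earlier fragment index, later fragment index, relation).
    links: list[tuple[int, int, Relation]] = field(default_factory=list)

    def add(self, fragment):
        """Store a fragment and return its index."""
        self.fragments.append(fragment)
        return len(self.fragments) - 1

    def link(self, earlier, later, relation):
        """Record a cross-session relation (assumed to run forward in time)."""
        if self.fragments[earlier].session_id >= self.fragments[later].session_id:
            raise ValueError("links must point from an earlier session to a later one")
        self.links.append((earlier, later, relation))

    def history(self, idx):
        """Return (earlier text, relation) pairs that lead into fragment idx."""
        return [
            (self.fragments[src].text, rel)
            for src, dst, rel in self.links
            if dst == idx
        ]


store = MemoryStore()
a = store.add(MemoryFragment(1, Category.FACTUAL, "User lives in Seoul"))
b = store.add(MemoryFragment(3, Category.FACTUAL, "User moved to Busan"))
store.link(a, b, Relation.TRANSFORMATION)
```

In this sketch, `store.history(b)` returns the earlier fact together with the relation that connects it to the newer one, so a response model could read the change directly instead of inferring it from raw dialogue.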

Problem

Research questions and friction points this paper is trying to address.

Shifting the reasoning burden from response generation to memory construction

Synthesizing information across multiple dialogue sessions effectively

Reducing computational demands while maintaining personalized dialogue performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Shifts reasoning from inference to memory construction

Extracts categorized memory fragments with explicit relationships

Performs pre-storage reasoning to reduce computational demands
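One way to picture the response-time side of this design: if memory items were already enriched at storage time, the response step can reduce to selecting stored strings that fit a token budget. The sketch below is an illustration under stated assumptions, not the paper's retrieval method. The function name `pack_memories`, the relevance scores, the greedy selection rule, and the whitespace-based token count are all hypothetical.

```python
def pack_memories(scored_items, token_budget):
    """Greedily keep the highest-scoring memory strings that fit the budget.

    scored_items: iterable of (score, text) pairs; the scores are assumed to
    come from some retriever that is outside the scope of this sketch.
    """
    chosen, used = [], 0
    for score, text in sorted(scored_items, key=lambda pair: pair[0], reverse=True):
        cost = len(text.split())  # crude whitespace token count (assumption)
        if used + cost <= token_budget:
            chosen.append(text)
            used += cost
    return chosen


items = [
    (0.9, "User moved from Seoul to Busan in session 3"),
    (0.5, "User enjoys hiking on weekends"),
    (0.2, "User once mentioned a cousin who lives abroad and visits every summer"),
]
selected = pack_memories(items, token_budget=14)
```

With a budget of 14 whitespace tokens, the first two items (9 and 5 tokens) fit exactly and the third is skipped. No reasoning over raw dialogue happens in this step, which is the kind of saving the pre-storage approach aims for.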

🔎 Similar Papers

OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering