HiMeS: Hippocampus-inspired Memory System for Personalized AI Assistants

📅 2026-01-06
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limitations of traditional retrieval-augmented generation (RAG) in personalized AI assistants: constrained memory capacity and insufficient integration with dialogue history, which often lead to redundant clarifications and irrelevant document retrieval. Inspired by the hippocampal–neocortical memory system, the authors propose an end-to-end architecture that integrates short-term and long-term memory mechanisms. The approach employs reinforcement learning to compress dialogue context and pre-retrieve relevant knowledge, introduces a short-term memory extractor that mimics hippocampal–prefrontal coordination, and constructs a partitioned long-term memory network that stores user-specific information and reranks retrieved results. Evaluated on a real-world industrial dataset, the proposed method significantly outperforms cascaded RAG baselines, and ablation studies confirm the effectiveness and necessity of both memory components.

📝 Abstract
Large language models (LLMs) power many interactive systems such as chatbots, customer-service agents, and personal assistants. In knowledge-intensive scenarios requiring user-specific personalization, conventional retrieval-augmented generation (RAG) pipelines exhibit limited memory capacity and insufficient coordination between retrieval mechanisms and user-specific conversational history, leading to redundant clarification, irrelevant documents, and degraded user experience. Inspired by the hippocampus-neocortex memory mechanism, we propose HiMeS, an AI-assistant architecture that fuses short-term and long-term memory. Our contributions are fourfold: (1) A short-term memory extractor is trained end-to-end with reinforcement learning to compress recent dialogue and proactively pre-retrieve documents from the knowledge base, emulating the cooperative interaction between the hippocampus and prefrontal cortex. (2) A partitioned long-term memory network stores user-specific information and re-ranks retrieved documents, simulating distributed cortical storage and memory reactivation. (3) On a real-world industrial dataset, HiMeS significantly outperforms a cascaded RAG baseline on question-answering quality. (4) Ablation studies confirm the necessity of both memory modules and suggest a practical path toward more reliable, context-aware, user-customized LLM-based assistants.
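The two memory modules described in the abstract, a short-term extractor that compresses recent dialogue into a pre-retrieval query and a per-user partitioned long-term memory that reranks retrieved documents, can be sketched roughly as follows. All names and heuristics below are illustrative assumptions, not the paper's implementation: in particular, HiMeS trains the extractor end-to-end with reinforcement learning, which this toy version replaces with simple token-frequency statistics.

```python
from dataclasses import dataclass, field

def extract_short_term_query(dialogue, max_terms=5):
    """Short-term memory extractor (toy stand-in): compress the most
    recent turns into a small set of query terms by token frequency."""
    counts = {}
    for turn in dialogue[-3:]:  # attend only to recent context
        for tok in turn.lower().split():
            counts[tok] = counts.get(tok, 0) + 1
    ranked = sorted(counts, key=counts.get, reverse=True)
    return ranked[:max_terms]

def pre_retrieve(query_terms, knowledge_base, k=3):
    """Proactively fetch documents that overlap the compressed query."""
    def score(doc):
        return sum(t in set(doc.lower().split()) for t in query_terms)
    hits = sorted(knowledge_base, key=score, reverse=True)
    return [d for d in hits if score(d) > 0][:k]

@dataclass
class PartitionedLongTermMemory:
    """Per-user partitions store facts and rerank retrieved documents,
    loosely mimicking distributed cortical storage and reactivation."""
    partitions: dict = field(default_factory=dict)

    def store(self, user_id, fact):
        self.partitions.setdefault(user_id, []).append(fact)

    def rerank(self, user_id, docs):
        fact_toks = {w for f in self.partitions.get(user_id, [])
                     for w in f.lower().split()}
        overlap = lambda d: len(fact_toks & set(d.lower().split()))
        return sorted(docs, key=overlap, reverse=True)

# Toy end-to-end pass: retrieval is driven by recent dialogue, while the
# user's stored preference promotes the matching document during reranking.
ltm = PartitionedLongTermMemory()
ltm.store("u1", "prefers vegan recipes")
kb = ["Vegan pasta recipes for beginners",
      "Grilled steak techniques",
      "Quick pasta dinner ideas"]
query = extract_short_term_query(["I want a pasta dinner", "something quick"])
docs = ltm.rerank("u1", pre_retrieve(query, kb))
```

The point of the sketch is the division of labor: pre-retrieval depends only on the compressed short-term context, and personalization enters afterwards as a rerank over the candidates, so the user-specific partition never has to fit inside the prompt.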
Problem

Research questions and friction points this paper is trying to address.

personalized AI assistants
retrieval-augmented generation
memory capacity
user-specific personalization
conversational history
Innovation

Methods, ideas, or system contributions that make the work stand out.

hippocampus-inspired memory
retrieval-augmented generation
short-term memory
long-term memory
personalized AI assistants