MeMo: Towards Language Models with Associative Memory Mechanisms

📅 2025-02-18

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Transformer-based large language models (LLMs) lack explicit, interpretable mechanisms for memorizing text. To address this, we propose MeMo, a novel architecture embodying the “memory-before-learning” paradigm. MeMo introduces a hierarchical associative memory module that enables explicit, transparent, and controllable storage, retrieval, and forgetting of token-level textual content, in contrast to the implicit memory held in learned parameters. Its core contributions are threefold: (1) a fully editable and interpretable explicit memory interface; (2) precise memory writing and retrieval without any training or parameter updates; and (3) flexible support for both single-layer and cross-layer associative modeling. Extensive experiments demonstrate that MeMo significantly enhances memory controllability, traceability, and intervenability while maintaining computational efficiency, offering a principled pathway toward human-like memory mechanisms in foundation models.

📝 Abstract

Memorization is a fundamental ability of Transformer-based Large Language Models, achieved through learning. In this paper, we propose a paradigm shift by designing an architecture to memorize text directly, bearing in mind the principle that memorization precedes learning. We introduce MeMo, a novel architecture for language modeling that explicitly memorizes sequences of tokens in layered associative memories. By design, MeMo offers transparency and the possibility of model editing, including forgetting texts. We experimented with the MeMo architecture, showing the memorization power of the one-layer and the multi-layer configurations.
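
To make the idea of writing, reading, and forgetting text in an associative memory concrete, here is a minimal sketch of one classic form of associative memory, a correlation matrix memory. The abstract does not specify MeMo's internals, so everything below is an illustrative assumption rather than the paper's implementation: the class name ToyAssociativeMemory, the helper random_vector, the 256-dimensional random sign vectors, the way a context is turned into a key, and the 0.5 read threshold are all hypothetical. A (context, next token) pair is stored by adding the outer product of the token vector and the context key to a single matrix, and it is forgotten by subtracting the same outer product, with no training involved.

```python
import math
import random


def random_vector(label, dim):
    """Deterministic pseudo-random unit vector with entries +-1/sqrt(dim)."""
    rng = random.Random(label)  # string seed: the same label always gives the same vector
    scale = 1.0 / math.sqrt(dim)
    return [rng.choice((-scale, scale)) for _ in range(dim)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class ToyAssociativeMemory:
    """Hypothetical single-layer memory: W is a sum of outer products value * key^T."""

    def __init__(self, vocab, dim=256):
        self.dim = dim
        # One random vector per token; random high-dimensional vectors are nearly orthogonal.
        self.token_vec = {tok: random_vector("tok:" + tok, dim) for tok in vocab}
        self.W = [[0.0] * dim for _ in range(dim)]

    def _key(self, context):
        # Assumption: each ordered context gets its own random key vector.
        return random_vector("ctx:" + " ".join(context), self.dim)

    def _update(self, context, next_token, sign):
        key, value = self._key(context), self.token_vec[next_token]
        for i, v_i in enumerate(value):
            row = self.W[i]
            for j, k_j in enumerate(key):
                row[j] += sign * v_i * k_j

    def write(self, context, next_token):
        """Store the association context -> next_token (no gradient step)."""
        self._update(context, next_token, +1.0)

    def forget(self, context, next_token):
        """Remove a stored association by subtracting its outer product."""
        self._update(context, next_token, -1.0)

    def read(self, context, threshold=0.5):
        """Return the best-matching token, or None if nothing matches well."""
        key = self._key(context)
        retrieved = [dot(row, key) for row in self.W]  # W @ key
        scores = {tok: dot(retrieved, vec) for tok, vec in self.token_vec.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] >= threshold else None


memory = ToyAssociativeMemory(vocab=["the", "cat", "sat", "dog", "ran"])
memory.write(("the", "cat"), "sat")
memory.write(("the", "dog"), "ran")
print(memory.read(("the", "cat")))
memory.forget(("the", "cat"), "sat")
print(memory.read(("the", "cat")))
```

In this sketch the first read should recover the stored continuation, and the read after forget should find no token above the threshold, because the only association for that context has been subtracted out. The memory content stays inspectable throughout, since every stored pair corresponds to one identifiable additive term in the matrix.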

Problem

Research questions and friction points this paper is trying to address.

Enhance language models with memory

Direct text memorization architecture

Enable model editing and forgetting

Innovation

Methods, ideas, or system contributions that make the work stand out.

Associative Memory Mechanisms

Explicit Token Sequence Memorization

Transparent Model Editing and Forgetting

🔎 Similar Papers

Chrono: A Simple Blueprint for Representing Time in MLLMs