Enhancing Large Language Model with Self-Controlled Memory Framework

๐Ÿ“… 2023-04-26
๐Ÿ“ˆ Citations: 18
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Large language models (LLMs) suffer from long-range context loss due to input length constraints, hindering effective long-term memory retention. To address this, we propose the Self-Controlled Memory (SCM) frameworkโ€”a tripartite architecture comprising an LLM agent, a streaming memory store, and a learnable memory controller that jointly enable dynamic memory storage, autonomous retrieval decisions, and incremental memory updates. SCMโ€™s key innovations include: (i) zero-shot plug-and-play integration without model fine-tuning; (ii) native support for ultra-long-context processing; and (iii) an instruction-aligned zero-shot ensemble mechanism for robust memory utilization. Evaluated on long-horizon dialogue, book summarization, and meeting summarization tasks, SCM significantly improves key information recall rates and response informativeness, consistently outperforming state-of-the-art baselines across all metrics.
๐Ÿ“ Abstract
Large Language Models (LLMs) are constrained by their inability to process lengthy inputs, resulting in the loss of critical historical information. To address this limitation, in this paper, we propose the Self-Controlled Memory (SCM) framework to enhance the ability of LLMs to maintain long-term memory and recall relevant information. Our SCM framework comprises three key components: an LLM-based agent serving as the backbone of the framework, a memory stream storing agent memories, and a memory controller updating memories and determining when and how to utilize memories from memory stream. Additionally, the proposed SCM is able to process ultra-long texts without any modification or fine-tuning, which can integrate with any instruction following LLMs in a plug-and-play paradigm. Furthermore, we annotate a dataset to evaluate the effectiveness of SCM for handling lengthy inputs. The annotated dataset covers three tasks: long-term dialogues, book summarization, and meeting summarization. Experimental results demonstrate that our method achieves better retrieval recall and generates more informative responses compared to competitive baselines in long-term dialogues. (https://github.com/wbbeyourself/SCM4LLMs)
Problem

Research questions and friction points this paper is trying to address.

Enhance LLMs' ability to process lengthy inputs.
Propose Self-Controlled Memory (SCM) for long-term memory retention.
Evaluate SCM on tasks like dialogues and summarization.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Controlled Memory (SCM) enhances LLMs' long-term memory.
SCM integrates with LLMs in a plug-and-play manner.
SCM processes ultra-long texts without fine-tuning.
๐Ÿ”Ž Similar Papers
No similar papers found.
Xinnian Liang
Xinnian Liang
Bytedance Inc.
Large Language Model
B
Bin Wang
State Key Lab of Software Development Environment, Beihang University, Beijing, China
H
Huijia Huang
Harbin Institute of Technology, Harbin, China
Shuangzhi Wu
Shuangzhi Wu
Bytedance
Machine TranslationDeep LearningNatural Language Processing
P
Peihao Wu
ByteDance AI Lab, Beijing, China
L
Lu Lu
ByteDance AI Lab, Beijing, China
Zejun Ma
Zejun Ma
Bytedance
machine learningdeep learningmultimodal
Zhoujun Li
Zhoujun Li
Beihang University
Artificial IntelligentNatural Language ProcessingNetwork Security