Chain-of-Memory: Lightweight Memory Construction with Dynamic Evolution for LLM Agents

📅 2026-01-14
🏛️ arXiv.org
📈 Citations: 1
Influential: 0

🤖 AI Summary
External memory systems for large language model agents often suffer from high construction costs and a disconnect between retrieval and reasoning, which leads to substantial computational overhead and limited reasoning accuracy. This work proposes the Chain-of-Memory (CoM) framework, which replaces the prevailing “heavy construction, light utilization” paradigm with a “light construction, strong utilization” memory architecture. CoM keeps memory construction lightweight, organizes retrieved fragments into coherent reasoning paths through dynamically evolving memory chains, and applies an adaptive truncation mechanism to filter out noise. Evaluated on the LongMemEval and LoCoMo benchmarks, CoM improves accuracy by 7.5%–10.4% over strong baselines while using approximately 2.7% of the tokens and 6.0% of the latency required by more complex memory architectures.

📝 Abstract
External memory systems are pivotal for enabling Large Language Model (LLM) agents to maintain persistent knowledge and perform long-horizon decision-making. Existing paradigms typically follow a two-stage process: computationally expensive memory construction (e.g., structuring data into graphs) followed by naive retrieval-augmented generation. However, our empirical analysis reveals two fundamental limitations: complex construction incurs high costs with marginal performance gains, and simple context concatenation fails to bridge the gap between retrieval recall and reasoning accuracy. To address these challenges, we propose CoM (Chain-of-Memory), a novel framework that advocates for a paradigm shift toward lightweight construction paired with sophisticated utilization. CoM introduces a Chain-of-Memory mechanism that organizes retrieved fragments into coherent inference paths through dynamic evolution, utilizing adaptive truncation to prune irrelevant noise. Extensive experiments on the LongMemEval and LoCoMo benchmarks demonstrate that CoM outperforms strong baselines with accuracy gains of 7.5%-10.4%, while drastically reducing computational overhead to approximately 2.7% of token consumption and 6.0% of latency compared to complex memory architectures.
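
The abstract does not spell out how memory chains are built or truncated, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes retrieved fragments are plain strings, uses word-set overlap as a stand-in for a real relevance model, and treats "adaptive truncation" as stopping the chain once the next link becomes too weak. The names `word_overlap`, `evolve_chain`, `min_link`, and `max_len` are hypothetical.

```python
def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap between word sets (hypothetical stand-in for a learned similarity)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def evolve_chain(query: str, fragments: list[str],
                 min_link: float = 0.05, max_len: int = 4) -> list[str]:
    """Greedily grow a chain of fragments, starting from the one closest to the query.

    Assumption: each step appends the remaining fragment that links best to the
    current chain tail, and the chain is cut off as soon as that best link
    falls below min_link (a simple form of truncation).
    """
    remaining = list(fragments)
    if not remaining:
        return []
    # Seed the chain with the fragment most similar to the query.
    seed = max(remaining, key=lambda t: word_overlap(query, t))
    chain = [seed]
    remaining.remove(seed)
    while remaining and len(chain) < max_len:
        best = max(remaining, key=lambda t: word_overlap(chain[-1], t))
        if word_overlap(chain[-1], best) < min_link:
            break  # truncate: no remaining fragment connects to the chain
        chain.append(best)
        remaining.remove(best)
    return chain


fragments = [
    "alice moved to paris in 2021",
    "paris has a rainy spring",
    "bob likes chess",
]
chain = evolve_chain("where did alice move", fragments)
print(chain)
```

In this toy run the unrelated fragment about Bob is dropped because it shares no words with the chain tail. A real system would presumably use embeddings or an LLM to score links instead of word overlap.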
Problem

Research questions and friction points this paper is trying to address.

external memory
memory construction
retrieval-augmented generation
reasoning accuracy
computational overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

Chain-of-Memory
lightweight memory construction
dynamic evolution
adaptive truncation
retrieval-augmented generation
Xiucheng Xu
State Key Laboratory of AI Safety, Beijing, 100086; Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Bingbing Xu
Associate professor, Institute of Computing Technology, Chinese Academy of Sciences
Graph Neural Networks, Network Embedding
Xueyun Tian
Institute of Computing Technology
Multimodal Generation, MLLM
Zihe Huang
State Key Laboratory of AI Safety, Beijing, 100086; Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Rongxin Chen
State Key Laboratory of AI Safety, Beijing, 100086; Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences
Yunfan Li
Sichuan University, College of Computer Science, Chengdu, China
Clustering
Huawei Shen
State Key Laboratory of AI Safety, Beijing, 100086; Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences