LifeBench: A Benchmark for Long-Horizon Multi-Source Memory

📅 2026-03-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses two limitations of current AI memory evaluation frameworks: they focus predominantly on explicit memory while neglecting implicit forms such as procedural and habitual memory, and they fail to emulate real-world memory tasks, which are multi-source, long-term, and require behavioral inference. To bridge this gap, the authors propose LifeBench, a benchmark that integrates explicit and implicit memory through a long-horizon, multi-source design. LifeBench incorporates non-declarative memory into AI evaluation. It draws on real-world priors (anonymized social surveys, map APIs, and public holiday calendars) and uses a hierarchical parallel generation strategy, grounded in the part-whole (partonomic) structure of events studied in cognitive science, to construct dense, consistent, and diverse long-term memory events. Experiments show that state-of-the-art memory systems reach only 55.2% accuracy on LifeBench, underscoring the difficulty of long-horizon retrieval and multi-source memory integration.

📝 Abstract
Long-term memory is fundamental for personalized agents capable of accumulating knowledge, reasoning over user experiences, and adapting across time. However, existing memory benchmarks primarily target declarative memory, specifically semantic and episodic types, where all information is explicitly presented in dialogues. In contrast, real-world actions are also governed by non-declarative memory, including habitual and procedural types, which must be inferred from diverse digital traces. To bridge this gap, we introduce LifeBench, which features densely connected, long-horizon event simulation. It pushes AI agents beyond simple recall, requiring the integration of declarative and non-declarative memory reasoning across diverse and temporally extended contexts. Building such a benchmark presents two key challenges: ensuring data quality and scalability. We maintain data quality by employing real-world priors, including anonymized social surveys, map APIs, and holiday-integrated calendars, thus enforcing fidelity, diversity, and behavioral rationality within the dataset. For scalability, we draw inspiration from cognitive science and structure events according to their partonomic hierarchy, enabling efficient parallel generation while maintaining global coherence. Results show that state-of-the-art memory systems reach just 55.2% accuracy, highlighting the inherent difficulty of long-horizon retrieval and multi-source integration within our proposed benchmark. The dataset and data synthesis code are available at https://github.com/1754955896/LifeBench.
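The abstract's idea of structuring events by partonomic (part-whole) hierarchy so they can be generated in parallel can be illustrated with a minimal sketch. This is not the authors' pipeline: `EventNode`, `propose_parts`, and `build_tree` are hypothetical names, and `propose_parts` is a deterministic stand-in for whatever generator (for example, a language-model call) would split an event into sub-events. The sketch assumes that each child receives its ancestors' path as context, which is one simple way to keep sibling branches consistent while they are expanded concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class EventNode:
    """One node in a part-whole (partonomic) event tree."""
    label: str
    context: str                      # ancestor path, passed down for coherence
    children: list = field(default_factory=list)


def propose_parts(node):
    """Hypothetical stand-in for a generator that splits an event into parts."""
    return [f"{node.label}.{i}" for i in range(1, 4)]


def expand_one(node):
    """Attach sub-events to one node; each child inherits the parent's path."""
    path = f"{node.context}/{node.label}".lstrip("/")
    node.children = [EventNode(label, context=path) for label in propose_parts(node)]
    return node.children


def build_tree(root_label, max_depth=2, workers=4):
    """Expand the tree level by level; nodes on the same level run in parallel."""
    root = EventNode(root_label, context="")
    frontier = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_depth):
            # Siblings are independent given their inherited context,
            # so a whole level can be expanded concurrently.
            results = list(pool.map(expand_one, frontier))
            frontier = [child for children in results for child in children]
    return root


def count_leaves(node):
    """Count the finest-grained events in the tree."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


tree = build_tree("year")
print(count_leaves(tree))                      # 3 parts per node, 2 levels
print(tree.children[0].children[0].context)    # path inherited from ancestors
```

Expanding level by level, rather than recursing inside worker threads, avoids tasks that block on their own subtasks in a bounded pool. In a real generator the inherited context would presumably carry a summary of the parent event rather than a bare path.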
Problem

Research questions and friction points this paper is trying to address.

long-horizon memory
non-declarative memory
multi-source memory
memory benchmark
personalized agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

long-horizon memory
non-declarative memory
multi-source integration
partonomic hierarchy
event simulation
Zihao Cheng
Nanjing University of Posts and Telecommunications
security of CPS, DoS attacks, predictive control, multi-agent systems
Weixin Wang
George Washington University
nonlinear filtering, inertial sensors
Yu Zhao
Huawei Technologies Co., Ltd.
Ziyang Ren
Huawei Technologies Co., Ltd.
Jiaxuan Chen
State Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University
Ruiyang Xu
Huawei Technologies Co., Ltd.
Shuai Huang
Huawei Technologies Co., Ltd.
Yang Chen
Huawei Technologies Co., Ltd.
Guowei Li
Huawei Technologies Co., Ltd.
Mengshi Wang
Huawei Technologies Co., Ltd.
Yi Xie
Huawei Technologies Co., Ltd.
Ren Zhu
Huawei Technologies Co., Ltd.
Zeren Jiang
University of Oxford
Computer Graphics, Digital Human, 3D Vision
Keda Lu
Huawei Technologies Co., Ltd.
Yihong Li
Huawei Technologies Co., Ltd.
Xiaoliang Wang
Associate Professor of Computer Science, Nanjing University
Networking System
Liwei Liu
Shenzhen University
Biophotonics
Cam-Tu Nguyen
Associate Professor of AI School, Nanjing University, China
Data Mining, Image Annotation, Text Mining, Machine Learning, Graphical Models