CloneMem: Benchmarking Long-Term Memory for AI Clones

📅 2026-01-11

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Current evaluations of AI memory rely predominantly on fragmented dialogue histories, which cannot adequately test whether a system models an individual's thoughts and behaviors across a long, continuous life trajectory. This work proposes CloneMem, a longitudinal memory benchmark grounded in non-dialogue digital traces (diaries, emails, and social media posts) spanning one to three years, and uses a hierarchical data construction framework to preserve temporal coherence. Its tasks evaluate how well an AI system remembers and reasons about the evolution of personal states over time. Experiments show that existing memory mechanisms struggle on this benchmark, highlighting open challenges in modeling long-term, personalized memory in life-grounded settings. By moving beyond conversational data, the study provides a testbed for research on AI Clones and personalized memory systems.

📝 Abstract

AI Clones aim to simulate an individual's thoughts and behaviors to enable long-term, personalized interaction, placing stringent demands on memory systems to model experiences, emotions, and opinions over time. Existing memory benchmarks primarily rely on user-agent conversational histories, which are temporally fragmented and insufficient for capturing continuous life trajectories. We introduce CloneMem, a benchmark for evaluating long-term memory in AI Clone scenarios grounded in non-conversational digital traces, including diaries, social media posts, and emails, spanning one to three years. CloneMem adopts a hierarchical data construction framework to ensure longitudinal coherence and defines tasks that assess an agent's ability to track evolving personal states. Experiments show that current memory mechanisms struggle in this setting, highlighting open challenges for life-grounded personalized AI. Code and dataset are available at https://github.com/AvatarMemory/CloneMemBench
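
To make the idea of tracking evolving personal states over time-ordered traces concrete, here is a minimal Python sketch. It is an illustration only, not the CloneMem data format or evaluation code: the `Trace` schema, the `state` annotations, and the helper names `state_at` and `state_changes` are all hypothetical, and the sample entries are invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Trace:
    """One non-conversational digital trace (hypothetical schema, not CloneMem's)."""
    when: date
    source: str   # e.g. "diary", "email", "social_post"
    text: str
    state: dict   # hypothetical annotation of personal state, e.g. {"job": "intern"}


def state_at(traces, key, query_date):
    """Return the latest value of `key` recorded on or before `query_date`."""
    value = None
    for trace in sorted(traces, key=lambda t: t.when):
        if trace.when > query_date:
            break
        if key in trace.state:
            value = trace.state[key]
    return value


def state_changes(traces, key):
    """List (date, value) pairs for each point where the value of `key` changes."""
    changes = []
    for trace in sorted(traces, key=lambda t: t.when):
        if key in trace.state:
            new_value = trace.state[key]
            if not changes or changes[-1][1] != new_value:
                changes.append((trace.when, new_value))
    return changes


# Invented sample timeline, for illustration only.
traces = [
    Trace(date(2023, 3, 1), "diary", "Started the internship today.", {"job": "intern"}),
    Trace(date(2023, 9, 15), "email", "Accepting the full-time offer.", {"job": "engineer"}),
    Trace(date(2024, 2, 2), "social_post", "Feeling burnt out lately.", {"mood": "tired"}),
]
```

A question such as "what was this person's job in June 2023?" then corresponds to `state_at(traces, "job", date(2023, 6, 1))`, while `state_changes(traces, "job")` recovers the trajectory a memory system would need to reconstruct. In a real benchmark the state would have to be inferred from the trace text rather than read from annotations.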

Problem

Research questions and friction points this paper is trying to address.

AI Clones

long-term memory

memory benchmark

personalized AI

digital traces

Innovation

Methods, ideas, or system contributions that make the work stand out.

long-term memory

AI Clones

digital traces