Hierarchical Memory Orchestration for Personalized Persistent Agents

📅 2026-04-02
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenges posed by long-term memory accumulation on personal devices, where retrieval noise and computational latency hinder agent reasoning and personalization. To mitigate these issues, the authors propose a Hierarchical Memory Orchestration (HMO) framework that organizes memory into three structured tiers. By integrating user profiling with context-aware relevance assessment, HMO dynamically maintains a compact primary cache and selectively promotes high-value memories to more active tiers. This approach enables efficient memory scheduling and facilitates precise personalized responses. The authors report state-of-the-art performance across multiple benchmarks and significant improvements in agent fluidity and personalization in real-world deployments such as OpenClaw.
📝 Abstract
While long-term memory is essential for intelligent agents to maintain consistent historical awareness, the accumulation of extensive interaction data often leads to performance bottlenecks. Naive storage expansion increases retrieval noise and computational latency, overwhelming the reasoning capacity of models deployed on constrained personal devices. To address this, we propose Hierarchical Memory Orchestration (HMO), a framework that organizes interaction history into a three-tiered directory driven by user-centric contextual relevance. Our system maintains a compact primary cache, coupling recent and pivotal memories with an evolving user profile to ensure agent reasoning remains aligned with individual behavioral traits. This primary cache is complemented by a high-priority secondary layer, both of which are managed within a global archive of the full interaction history. Crucially, the user persona dictates memory redistribution across this hierarchy, promoting records mapped to long-term patterns toward more active tiers while relegating less relevant information. This targeted orchestration surfaces historical knowledge precisely when needed while maintaining a lean and efficient active search space. In evaluations on multiple benchmarks, HMO achieves state-of-the-art performance. Real-world deployments in ecosystems like OpenClaw demonstrate that HMO significantly enhances agent fluidity and personalization.
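The three-tier layout described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Memory` record, the `relevance` score (profile-tag weight plus a recency bonus), the tier sizes, and the sample data are all hypothetical, chosen only to show how a user profile could decide which records sit in the primary cache, which sit in the secondary layer, and which remain only in the global archive.

```python
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    turn: int          # interaction index at which the record was stored
    tags: tuple = ()   # hypothetical topic labels attached to the record


def relevance(mem, profile, now):
    """Hypothetical score: profile-tag weight plus a recency bonus."""
    overlap = sum(profile.get(tag, 0.0) for tag in mem.tags)
    recency = 1.0 / (1 + now - mem.turn)
    return overlap + recency


def orchestrate(archive, profile, now, cache_size=2, secondary_size=2):
    """Rank the full archive and return (primary_cache, secondary_layer).

    Every record stays in the archive; the two active tiers are
    views selected from it (tier sizes here are assumptions).
    """
    ranked = sorted(archive, key=lambda m: relevance(m, profile, now), reverse=True)
    primary = ranked[:cache_size]
    secondary = ranked[cache_size:cache_size + secondary_size]
    return primary, secondary


# Illustrative data only; not taken from the paper.
archive = [
    Memory("prefers vegetarian recipes", turn=1, tags=("diet",)),
    Memory("asked about the weather", turn=2, tags=("weather",)),
    Memory("training for a marathon", turn=3, tags=("fitness",)),
    Memory("asked for a movie title", turn=4, tags=("movies",)),
    Memory("booked a dentist appointment", turn=5, tags=("health",)),
]
profile = {"diet": 1.0, "fitness": 0.8}  # hypothetical evolving user profile

primary, secondary = orchestrate(archive, profile, now=5)
print([m.text for m in primary])
print([m.text for m in secondary])
```

In this toy setup, older records that match the profile outrank newer but unrelated ones, so they are promoted into the primary cache, while the low-scoring weather query is relegated to the archive. Re-running `orchestrate` after the profile changes redistributes records across tiers without deleting anything.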
Problem

Research questions and friction points this paper is trying to address.

long-term memory
performance bottlenecks
retrieval noise
computational latency
personalized agents
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical Memory Orchestration
Personalized Agents
User-Centric Memory
Contextual Relevance
Memory Tiering
Junming Liu
Shanghai Artificial Intelligence Laboratory, Shanghai, China
Yifei Sun
Shanghai Artificial Intelligence Laboratory, Shanghai, China
Weihua Cheng
ShanghaiTech University
LLM, Agent, Knowledge Tracing
Haodong Lei
Shanghai Artificial Intelligence Laboratory, Shanghai, China
Yuqi Li
The City College of New York, the City University of New York
Model Compress, Computer Vision
Yirong Chen
Stanford University
Ding Wang
Shanghai AI Lab
Artificial Intelligence, Agentic System, Digital Twin