🤖 AI Summary
This work addresses the challenges posed by long-term memory accumulation on personal devices, where retrieval noise and computational latency hinder agent reasoning and personalization. To mitigate these issues, the authors propose a Hierarchical Memory Orchestration (HMO) framework that organizes memory into three structured tiers. By integrating user profiling with context-aware relevance assessment, HMO dynamically maintains a compact primary cache and selectively promotes high-value memories to an active layer. This approach enables efficient memory scheduling and facilitates precise personalized responses. The method achieves state-of-the-art performance across multiple benchmarks and demonstrates significant improvements in agent fluency and adaptability when deployed in real-world systems such as OpenClaw.
📝 Abstract
While long-term memory is essential for intelligent agents to maintain consistent historical awareness, the accumulation of extensive interaction data often leads to performance bottlenecks. Naive storage expansion increases retrieval noise and computational latency, overwhelming the reasoning capacity of models deployed on constrained personal devices. To address this, we propose Hierarchical Memory Orchestration (HMO), a framework that organizes interaction history into a three-tiered directory driven by user-centric contextual relevance. Our system maintains a compact primary cache, coupling recent and pivotal memories with an evolving user profile to ensure agent reasoning remains aligned with individual behavioral traits. This primary cache is complemented by a high-priority secondary layer, both of which are managed within a global archive of the full interaction history. Crucially, the user persona dictates memory redistribution across this hierarchy, promoting records mapped to long-term patterns toward more active tiers while relegating less relevant information. This targeted orchestration surfaces historical knowledge precisely when needed while maintaining a lean and efficient active search space. Evaluations on multiple benchmarks achieve state-of-the-art performance. Real-world deployments in ecosystems like OpenClaw demonstrate that HMO significantly enhances agent fluidity and personalization.