🤖 AI Summary
Current agent memory systems treat long-term memory as storage and localize correctness at individual records, embeddings, or edges. The result is four recurring failure modes: unregulated growth, missing semantic revision, capacity-driven forgetting, and read-only retrieval. This work proposes Governed Evolving Memory (GEM), which treats correctness as a property of the memory's state trajectory rather than of individual records. GEM defines four state-level operators (ingestion, revision, forgetting, and retrieval) governed by six correctness conditions. The authors implement MemState, a prototype on a property-graph backend, to demonstrate GEM's feasibility. The prototype also exposes the gap between a system built on an existing backend and a memory-native engine, and the authors outline three research directions for memory-centric data management.
📝 Abstract
Long-running AI agents need persistent memory. Memory supports learning across sessions, reduces repeated context injection, and enables auditing of past decisions. Current agent memory systems and database paradigms treat memory as storage. They localize correctness at records, embeddings, or edges. Each supplies only some of the capabilities that long-term memory requires. The result is four recurring failure modes: unregulated growth, missing semantic revision, capacity-driven forgetting, and read-only retrieval. In our vision, long-term agent memory is a new data-management workload. Its correctness is a property of the state trajectory, not of individual records. We formalize this as Governed Evolving Memory (GEM). GEM replaces record-level database operations with four state-level operators: ingestion, revision, forgetting, and retrieval. Six correctness conditions govern how the state evolves. Three structural observations establish that no record-level system can satisfy these conditions, regardless of the storage model. We realize the abstraction in MemState, a prototype on a property-graph backend. MemState validates feasibility and exposes the gap to a native engine. We outline three research directions that define memory-centric data management as a workload.
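To make the state-trajectory idea concrete, the following is a minimal illustrative sketch, not the paper's MemState implementation. It assumes a toy key-value memory in which each of the four operators appends a new state to a trajectory. The names `State`, `Memory`, `trajectory`, and `reads` are hypothetical, and the six correctness conditions are not modeled because the abstract does not enumerate them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class State:
    """One point on the memory trajectory (hypothetical layout)."""
    version: int
    facts: Dict[str, str]  # key -> current value
    reads: Dict[str, int]  # key -> number of successful retrievals


class Memory:
    """Toy memory whose operators advance a state trajectory."""

    def __init__(self) -> None:
        # The trajectory starts with an empty state at version 0.
        self.trajectory: List[State] = [State(0, {}, {})]

    @property
    def current(self) -> State:
        return self.trajectory[-1]

    def _advance(self, facts: Dict[str, str], reads: Dict[str, int]) -> None:
        # Earlier states are never modified; a new one is appended.
        self.trajectory.append(State(self.current.version + 1, facts, reads))

    def ingest(self, key: str, value: str) -> None:
        """Add a new fact; changing an existing one requires revise()."""
        if key in self.current.facts:
            raise KeyError(f"{key!r} already present; use revise()")
        self._advance({**self.current.facts, key: value}, dict(self.current.reads))

    def revise(self, key: str, value: str) -> None:
        """Replace the value of an existing fact."""
        if key not in self.current.facts:
            raise KeyError(f"{key!r} not present; use ingest()")
        self._advance({**self.current.facts, key: value}, dict(self.current.reads))

    def forget(self, key: str) -> None:
        """Remove a fact by explicit request (no capacity policy modeled)."""
        facts = {k: v for k, v in self.current.facts.items() if k != key}
        reads = {k: n for k, n in self.current.reads.items() if k != key}
        self._advance(facts, reads)

    def retrieve(self, key: str) -> Optional[str]:
        """Return a fact and record the access as a state change."""
        value = self.current.facts.get(key)
        if value is not None:
            reads = dict(self.current.reads)
            reads[key] = reads.get(key, 0) + 1
            self._advance(dict(self.current.facts), reads)
        return value


memory = Memory()
memory.ingest("city", "Paris")
memory.revise("city", "Lyon")
answer = memory.retrieve("city")
memory.forget("city")
versions = [state.version for state in memory.trajectory]
```

In this sketch a successful retrieval also appends a state, which is one simple way to avoid treating retrieval as read-only. Because earlier states are kept, the history of a fact can be inspected after it has been revised or forgotten.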