🤖 AI Summary
This work addresses the limitations of traditional large language model (LLM) agents whose memory systems rely on static, handcrafted operations, leading to poor adaptability across diverse interaction scenarios and inefficiency over long interaction histories. The authors propose reframing memory operations as learnable and evolvable memory skills within a tripartite architecture comprising a controller, an executor, and a designer. The controller dynamically selects appropriate skills, the LLM-based executor performs memory generation, and the designer iteratively refines the skill set using challenging cases, thereby establishing a closed-loop self-evolution mechanism. This approach is the first to model memory management as a jointly evolvable skill system, overcoming the constraints of fixed operations. Experiments demonstrate that the proposed system significantly outperforms strong baselines across multiple benchmarks—including LoCoMo, LongMemEval, HotpotQA, and ALFWorld—exhibiting superior generalization and memory adaptability.
📝 Abstract
Most Large Language Model (LLM) agent memory systems rely on a small set of static, hand-designed operations for extracting memories. These fixed procedures hard-code human priors about what to store and how to revise memory, making them rigid under diverse interaction patterns and inefficient on long histories. To this end, we present \textbf{MemSkill}, which reframes these operations as learnable and evolvable memory skills: structured, reusable routines for extracting, consolidating, and pruning information from interaction traces. Inspired by the design philosophy of agent skills, MemSkill employs a \emph{controller} that learns to select a small set of relevant skills, paired with an LLM-based \emph{executor} that produces skill-guided memories. Beyond learning skill selection, MemSkill introduces a \emph{designer} that periodically reviews hard cases where selected skills yield incorrect or incomplete memories, and evolves the skill set by proposing refinements and new skills. Together, these components form a closed-loop procedure that improves both the skill-selection policy and the skill set itself. Experiments on LoCoMo, LongMemEval, HotpotQA, and ALFWorld demonstrate that MemSkill improves task performance over strong baselines and generalizes well across settings. Further analyses shed light on how skills evolve, offering insights toward more adaptive, self-evolving memory management for LLM agents.
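The controller–executor–designer loop described above can be sketched in a few lines. This is a minimal, hypothetical illustration, not the paper's implementation: the skill names, the keyword-overlap scoring that stands in for the learned selection policy, the stubbed LLM executor, and the evolution trigger are all assumptions for clarity.

```python
from dataclasses import dataclass

@dataclass
class Skill:
    name: str
    instruction: str  # prompt fragment that would guide the LLM executor

class MemSkillLoop:
    """Toy sketch of the closed loop: select skills, produce skill-guided
    memories, and evolve the skill set from hard cases."""

    def __init__(self, skills, top_k=2):
        self.skills = list(skills)
        self.top_k = top_k
        self.hard_cases = []

    def select(self, trace):
        # Controller: score each skill against the trace. A naive keyword
        # overlap stands in for the learned selection policy.
        scored = sorted(
            self.skills,
            key=lambda s: -sum(word in trace for word in s.name.split("_")),
        )
        return scored[: self.top_k]

    def execute(self, trace, chosen):
        # Executor: in the real system an LLM produces skill-guided
        # memories; here we stub one memory entry per selected skill.
        return [f"[{s.name}] memory from: {trace[:30]}" for s in chosen]

    def review(self, trace, ok):
        # Designer: collect hard cases (incorrect/incomplete memories) and
        # periodically propose a refined skill. The trigger of 2 failures
        # is arbitrary.
        if not ok:
            self.hard_cases.append(trace)
        if len(self.hard_cases) >= 2:
            self.skills.append(
                Skill("refined_extract",
                      "Focus on entities the existing skills missed.")
            )
            self.hard_cases.clear()
```

A usage pass would call `select`, `execute`, then `review` per interaction trace, so both the selection behavior and the skill set change over time.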