🤖 AI Summary
Current agent systems struggle to evolve skills efficiently at test time: they rely on hard-coded evolution strategies or on costly parameter updates to the underlying LLM, and they cannot refine the skill-evolution mechanism itself. This work proposes HiSME, a lightweight hierarchical skill meta-evolution framework that refines the skill-evolution mechanism at test time. By learning meta-skills from agents' task execution traces, HiSME jointly optimizes skills and the strategy that evolves them, adapting to different downstream scenarios without updating LLM parameters. Experiments on diverse agentic benchmarks show that meta-evolution can yield a higher-quality skill library than pure skill evolution and can derive diverse meta-skills for different scenarios, supporting continual experience learning.
📝 Abstract
Test-time skill evolution is regarded as a new paradigm for enhancing deployed agentic systems. Existing work mainly focuses on hard-coded skill-evolution strategies or on parametric learning that relies on expensive parameter updates to the underlying LLMs. In this paper, we demonstrate that test-time refinement of the skill-evolution framework itself is necessary for continuous improvement of agent systems across different downstream scenarios, and that lightweight algorithmic adaptation is feasible. Specifically, we propose HiSME, a lightweight hierarchical skill meta-evolving solution that jointly optimizes skills and the skill-evolution strategy by learning meta-skills from agents' task execution traces. Experiments on diverse agentic benchmarks show that meta-evolving can produce a higher-quality skill library than pure skill evolution and can derive diverse meta-skills for different scenarios, thereby facilitating future continual experience learning. Our code is temporarily public at https://anonymous.4open.science/r/HiSME-BD45.