SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

📅 2026-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the difficulty large language model (LLM) agents have in learning efficiently from raw experience on complex tasks: existing memory mechanisms often store redundant, noisy trajectories without extracting reusable high-level behavioral abstractions. To overcome this limitation, the authors propose SkillRL, a framework that automatically builds a hierarchical skill library, SkillBank, through experience distillation and uses an adaptive retrieval strategy covering both general and task-specific heuristics. A recursive evolution mechanism lets the skill library co-evolve with the agent's policy during reinforcement learning, which reduces token consumption while improving reasoning utility. Evaluated on ALFWorld, WebShop, and seven search-augmented tasks, SkillRL achieves state-of-the-art performance, outperforming strong baselines by over 15.3% and remaining robust as task complexity increases.

📝 Abstract

Large Language Model (LLM) agents have shown stunning results in complex tasks, yet they often operate in isolation, failing to learn from past experiences. Existing memory-based methods primarily store raw trajectories, which are often redundant and noise-heavy. This prevents agents from extracting high-level, reusable behavioral patterns that are essential for generalization. In this paper, we propose SkillRL, a framework that bridges the gap between raw experience and policy improvement through automatic skill discovery and recursive evolution. Our approach introduces an experience-based distillation mechanism to build a hierarchical skill library, SkillBank; an adaptive retrieval strategy for general and task-specific heuristics; and a recursive evolution mechanism that allows the skill library to co-evolve with the agent's policy during reinforcement learning. These innovations significantly reduce the token footprint while enhancing reasoning utility. Experimental results on ALFWorld, WebShop, and seven search-augmented tasks demonstrate that SkillRL achieves state-of-the-art performance, outperforming strong baselines by over 15.3% and maintaining robustness as task complexity increases. Code is available at https://github.com/aiming-lab/SkillRL.
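
The retrieval idea in the abstract (general heuristics plus task-specific ones drawn from a skill library) can be illustrated with a small Python sketch. This is not the authors' implementation: the `Skill` and `SkillBank` classes, their fields, the replace-by-name update, and the word-overlap ranking are all hypothetical stand-ins chosen for brevity. The linked repository is the place to look for the actual method.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Skill:
    """A distilled heuristic (hypothetical structure, for illustration only)."""
    name: str
    hint: str
    task_type: Optional[str] = None  # None marks a general skill


@dataclass
class SkillBank:
    """Toy skill library with general and task-specific entries."""
    skills: List[Skill] = field(default_factory=list)

    def add(self, skill: Skill) -> None:
        # Assumption: a refined skill replaces any older skill with the same
        # name, loosely mimicking a library that evolves over training.
        self.skills = [s for s in self.skills if s.name != skill.name]
        self.skills.append(skill)

    def retrieve(self, task_type: str, query: str, k: int = 2) -> List[Skill]:
        # General skills are always returned.
        general = [s for s in self.skills if s.task_type is None]
        # Task-specific skills are ranked by naive word overlap with the
        # query (a stand-in for whatever relevance measure a real system uses).
        words = set(query.lower().split())
        specific = [s for s in self.skills if s.task_type == task_type]
        specific.sort(
            key=lambda s: len(words & set(s.hint.lower().split())),
            reverse=True,
        )
        return general + specific[:k]


bank = SkillBank()
bank.add(Skill("verify", "check the goal before acting"))
bank.add(Skill("find_object", "search likely containers for the object", "alfworld"))
bank.add(Skill("compare_price", "compare price options before buying", "webshop"))

names = [s.name for s in bank.retrieve("alfworld", "find the object in containers")]
print(names)
```

In this sketch, only the short `hint` strings of the retrieved skills would be placed in the agent's prompt rather than full past trajectories, which is one plausible reading of how a skill library could reduce the token footprint.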

Problem

Research questions and friction points this paper is trying to address.

LLM agents

experience learning

skill generalization

memory redundancy

behavioral patterns

Innovation

Methods, ideas, or system contributions that make the work stand out.

SkillRL

skill discovery

recursive evolution