EmbodiSkill: Skill-Aware Reflection for Self-Evolving Embodied Agents

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing skill self-evolution methods struggle to distinguish whether a task failure stems from errors in skill content or from execution deviations, which limits their performance in embodied environments. This work proposes a training-free, skill-aware reflection framework built on a skill-execution decoupling mechanism, which it presents as the first of its kind in embodied intelligence. A large language model analyzes task trajectories and separately identifies skill deficiencies and execution errors. Skill deficiencies trigger targeted updates to the skill repository, while execution errors lead to refined execution guidance, so that procedural knowledge accumulates over time. On the ALFWorld benchmark, the method enables a frozen Qwen3.5-27B executor to achieve a 93.28% task success rate, outperforming GPT-5.2 run without skills by 31.58%.

📝 Abstract

Embodied agents can benefit from skills that guide object search, action execution, and state changes across diverse environments. Since embodied environments vary across layouts, object states, and other execution factors, these skills must self-evolve from trajectories generated during task execution. However, existing skill self-evolution methods are mainly developed in digital environments and often convert trajectories into coarse skill updates. Directly applying this paradigm to embodied settings is problematic, because a failed task execution may reflect not only incorrect skill content, but also an execution lapse in which the agent fails to follow valid guidance. We propose EmbodiSkill, a training-free framework for embodied skill self-evolution through skill-aware reflection and targeted revision. EmbodiSkill interprets each trajectory with respect to the current skill, uses skill-changing evidence to update the skill body, and uses execution-lapse evidence to preserve and emphasize valid guidance. Experiments on ALFWorld and EmbodiedBench show that EmbodiSkill consistently improves embodied task success. On ALFWorld, EmbodiSkill enables a frozen Qwen3.5-27B executor to reach 93.28% task success, outperforming GPT-5.2 used as a direct agent without skills by 31.58%. These results show that skill-aware self-evolution helps embodied agents accumulate reusable procedural knowledge from their own trajectories.
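The routing idea in the abstract (skill-changing evidence revises the skill body, execution-lapse evidence preserves and emphasizes valid guidance) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Skill`, `Trajectory`, and `Reflection` types, the `evolve_skill` function, and the rule-based `toy_reflect` stand-in for the LLM reflector are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set

# Hypothetical labels for the two kinds of evidence described in the abstract.
SKILL_DEFICIENCY = "skill_deficiency"   # the skill content itself is wrong
EXECUTION_LAPSE = "execution_lapse"     # valid guidance was not followed


@dataclass
class Skill:
    steps: List[str]                                   # the skill body
    emphasized: Set[int] = field(default_factory=set)  # step indices to stress


@dataclass
class Trajectory:
    actions: List[str]
    success: bool


@dataclass
class Reflection:
    kind: str
    step_index: int
    revised_step: Optional[str] = None


# In the paper this role is played by an LLM; here it is any callable.
Reflector = Callable[[Skill, Trajectory], Optional[Reflection]]


def evolve_skill(skill: Skill, trajectory: Trajectory, reflect: Reflector) -> Skill:
    """Route a reflection to either a body revision or an emphasis marker."""
    reflection = reflect(skill, trajectory)
    if reflection is None:
        return skill  # no evidence that anything should change

    steps = list(skill.steps)
    emphasized = set(skill.emphasized)
    if reflection.kind == SKILL_DEFICIENCY and reflection.revised_step is not None:
        # Skill-changing evidence: rewrite the faulty step.
        steps[reflection.step_index] = reflection.revised_step
    elif reflection.kind == EXECUTION_LAPSE:
        # Execution-lapse evidence: keep the step, mark it for emphasis.
        emphasized.add(reflection.step_index)
    return Skill(steps, emphasized)


def toy_reflect(skill: Skill, trajectory: Trajectory) -> Optional[Reflection]:
    """Rule-based stand-in for the LLM reflector (an assumption, not the paper's logic)."""
    if trajectory.success:
        return None
    for i, step in enumerate(skill.steps):
        if step not in trajectory.actions:
            # A prescribed step was skipped: treat the failure as a lapse.
            return Reflection(EXECUTION_LAPSE, i)
    # Every step was followed and the task still failed: blame the skill.
    last = len(skill.steps) - 1
    return Reflection(SKILL_DEFICIENCY, last, skill.steps[last] + " (check alternative locations)")
```

Keeping the two update paths separate means a lapse never overwrites guidance that was already valid, which is the failure mode the abstract attributes to coarse skill updates.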

Problem

Research questions and friction points this paper is trying to address.

embodied agents

skill self-evolution

execution lapse

trajectory interpretation

procedural knowledge

Innovation

Methods, ideas, or system contributions that make the work stand out.

skill-aware reflection

self-evolving embodied agents

trajectory interpretation