Uni-Skill: Building Self-Evolving Skill Repository for Generalizable Robotic Manipulation

📅 2026-03-03
🤖 AI Summary
This work addresses a limitation of existing skill-centric approaches: they rely on fixed skill repositories and struggle to adapt to new tasks without human intervention. To overcome this, we propose Uni-Skill, a framework that integrates skill-aware planning with an automatic skill evolution mechanism to proactively acquire new skills when current capabilities are insufficient. Leveraging an offline-constructed SkillFolder repository, Uni-Skill enables efficient skill retrieval and introduces, for the first time, a self-evolving skill library coupled with a VerbNet-inspired hierarchical skill ontology. This paradigm shifts skill acquisition from manual annotation to automated, structured extraction from large-scale unlabeled robotic videos, thereby supporting zero-shot generalization. Experiments in simulated and real-world environments demonstrate that Uni-Skill significantly improves cross-task zero-shot generalization and complex task reasoning.
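
The idea of a planner that requests a missing skill instead of failing can be illustrated with a minimal Python sketch. This is not the authors' implementation: `SkillLibrary`, `get_or_evolve`, `synthesize_stub`, and the string-valued "skills" are all hypothetical names chosen for illustration, and the stub merely stands in for whatever mechanism actually produces a new skill.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a skill maps a target object to an action string.
Skill = Callable[[str], str]


@dataclass
class SkillLibrary:
    """A skill library that grows when a requested skill is missing."""
    skills: Dict[str, Skill] = field(default_factory=dict)
    added: List[str] = field(default_factory=list)  # names acquired on demand

    def get_or_evolve(self, name: str, synthesize: Callable[[str], Skill]) -> Skill:
        # If the planner asks for an unknown skill, create it once and keep it.
        if name not in self.skills:
            self.skills[name] = synthesize(name)
            self.added.append(name)
        return self.skills[name]


def synthesize_stub(name: str) -> Skill:
    # Placeholder for skill synthesis (an assumption, not the paper's method).
    return lambda target: f"{name}({target})"


def execute_plan(plan: List[Tuple[str, str]], library: SkillLibrary) -> List[str]:
    # Each plan step is (skill name, target object).
    return [library.get_or_evolve(name, synthesize_stub)(target) for name, target in plan]


library = SkillLibrary(skills={"pick": lambda t: f"pick({t})"})
plan = [("pick", "cup"), ("pour", "cup"), ("pour", "bowl")]
trace = execute_plan(plan, library)
print(trace)
print(library.added)
```

In this sketch the `pour` skill is synthesized on its first request and reused on the second, so the library is augmented exactly once.
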

📝 Abstract
While skill-centric approaches leverage foundation models to enhance generalization in compositional tasks, they often rely on fixed skill libraries, limiting adaptability to new tasks without manual intervention. To address this, we propose Uni-Skill, a Unified Skill-centric framework that supports skill-aware planning and facilitates automatic skill evolution. Unlike prior methods that restrict planning to predefined skills, Uni-Skill requests new skill implementations when existing ones are insufficient, ensuring adaptable planning with a self-augmented skill library. To support automatic implementation of diverse skills requested by the planning module, we construct SkillFolder, a VerbNet-inspired repository derived from large-scale unstructured robotic videos. SkillFolder introduces a hierarchical skill taxonomy that captures diverse skill descriptions at multiple levels of abstraction. By populating this taxonomy with large-scale, automatically annotated demonstrations, Uni-Skill shifts the paradigm of skill acquisition from inefficient manual annotation to efficient offline structural retrieval. Retrieved examples provide semantic supervision over behavior patterns and fine-grained references for spatial trajectories, enabling few-shot skill inference without deployment-time demonstrations. Comprehensive experiments in both simulation and real-world settings verify the state-of-the-art performance of Uni-Skill over existing VLM-based skill-centric approaches, highlighting its advanced reasoning capabilities and strong zero-shot generalization across a wide range of novel tasks.
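
Retrieval from a hierarchical skill taxonomy can be sketched as follows. The abstract does not specify SkillFolder's data layout or retrieval rule, so everything here is an assumption for illustration: the `SkillNode` class, the node names, the demonstration IDs, and the back-off rule (use a node's own demonstrations, otherwise fall back to the nearest ancestor that has some).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class SkillNode:
    """One node in a hypothetical verb-class hierarchy."""
    name: str
    parent: Optional["SkillNode"] = None
    demos: List[str] = field(default_factory=list)  # demonstration IDs (made up)


def build_taxonomy() -> Dict[str, SkillNode]:
    # Toy hierarchy: a coarse verb class with two finer-grained skills.
    root = SkillNode("manipulate")
    put = SkillNode("put", parent=root, demos=["demo_put_001"])
    place_in = SkillNode("put/place_in_container", parent=put,
                         demos=["demo_place_017", "demo_place_042"])
    stack = SkillNode("put/stack", parent=put)  # no demonstrations of its own
    return {n.name: n for n in (root, put, place_in, stack)}


def retrieve(taxonomy: Dict[str, SkillNode], name: str, k: int = 2) -> List[str]:
    """Return up to k demos for a skill, backing off to coarser levels."""
    node = taxonomy.get(name)
    while node is not None:
        if node.demos:
            return node.demos[:k]
        node = node.parent  # climb to a more abstract description
    return []


taxonomy = build_taxonomy()
print(retrieve(taxonomy, "put/place_in_container"))
print(retrieve(taxonomy, "put/stack"))
```

Here `put/stack` has no demonstrations of its own, so the lookup falls back to its parent class `put`; the retrieved examples would then serve as few-shot references for inferring the requested skill.
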
Problem

Research questions and friction points this paper is trying to address.

robotic manipulation
skill generalization
fixed skill library
manual intervention
task adaptability
Innovation

Methods, ideas, or system contributions that make the work stand out.

self-evolving skill repository
skill-centric planning
hierarchical skill taxonomy
few-shot skill inference
zero-shot generalization
Senwei Xie
ICT, CAS
Embodied AI
Yuntian Zhang
Key Laboratory of AI Safety of CAS, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China
Ruiping Wang
Professor, Institute of Computing Technology, Chinese Academy of Sciences
Computer Vision, Pattern Recognition, Machine Learning
Xilin Chen
Institute of Computing Technology, Chinese Academy of Sciences
Computer Vision, Pattern Recognition, Machine Learning