🤖 AI Summary
Offline multi-task reinforcement learning struggles to share knowledge across tasks. To address this, we propose a goal-oriented skill abstraction framework: (1) discrete, reusable skills are extracted from mixed-task offline datasets via vector quantization; (2) a skill enhancement mechanism is introduced to improve skill discriminability and robustness; and (3) a hierarchical policy, comprising a high-level scheduler and a low-level executor, is constructed to enable task-adaptive skill selection and transfer. Crucially, the method operates entirely offline, requiring no online interaction. Evaluated on the MetaWorld benchmark, it achieves significant improvements in multi-task average performance (+23.6%) and in zero-shot generalization to unseen tasks. These results empirically validate the effectiveness and generality of skill abstraction in offline multi-task RL settings.
📝 Abstract
Offline multi-task reinforcement learning aims to learn a unified policy capable of solving multiple tasks using only pre-collected task-mixed datasets, without requiring any online interaction with the environment. However, it faces significant challenges in effectively sharing knowledge across tasks. Inspired by the efficient knowledge abstraction observed in human learning, we propose Goal-Oriented Skill Abstraction (GO-Skill), a novel approach designed to extract and utilize reusable skills to enhance knowledge transfer and task performance. Our approach uncovers reusable skills through a goal-oriented skill extraction process and leverages vector quantization to construct a discrete skill library. To mitigate class imbalances between broadly applicable and task-specific skills, we introduce a skill enhancement phase to refine the extracted skills. Furthermore, we integrate these skills using hierarchical policy learning, enabling the construction of a high-level policy that dynamically orchestrates discrete skills to accomplish specific tasks. Extensive experiments on diverse robotic manipulation tasks within the MetaWorld benchmark demonstrate the effectiveness and versatility of GO-Skill.
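To make the vector-quantization step concrete, the following is a minimal sketch of how a continuous skill embedding could be mapped to an entry in a discrete skill library, and how skill usage could be counted to expose the kind of imbalance the abstract mentions. It is an illustration under stated assumptions, not the GO-Skill implementation: the function names `quantize` and `skill_usage`, the 2-D embeddings, the three-entry codebook, and the squared Euclidean distance are all hypothetical choices made for this example.

```python
import math
from typing import List, Sequence, Tuple


def quantize(embedding: Sequence[float],
             codebook: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, squared distance) of the nearest codebook entry.

    Assumption: nearest-neighbour lookup under squared Euclidean distance,
    the standard choice in vector quantization.
    """
    best_index, best_dist = -1, math.inf
    for index, code in enumerate(codebook):
        dist = sum((e - c) ** 2 for e, c in zip(embedding, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist


def skill_usage(embeddings: Sequence[Sequence[float]],
                codebook: Sequence[Sequence[float]]) -> List[int]:
    """Count how many embeddings are assigned to each discrete skill."""
    counts = [0] * len(codebook)
    for embedding in embeddings:
        index, _ = quantize(embedding, codebook)
        counts[index] += 1
    return counts


# Hypothetical skill library with three 2-D skill codes.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

# Hypothetical continuous skill embeddings from sub-trajectories.
embeddings = [[0.1, 0.1], [0.9, 0.2], [0.2, -0.1], [0.1, 0.8]]

skill_ids = [quantize(e, codebook)[0] for e in embeddings]
usage = skill_usage(embeddings, codebook)

print(skill_ids)  # [0, 1, 0, 2]
print(usage)      # [2, 1, 1]
```

In this toy data, skill 0 is selected twice as often as the others. Uneven usage of this kind, between broadly applicable and task-specific skills, is what the skill enhancement phase is described as mitigating.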