Goal-Oriented Skill Abstraction for Offline Multi-Task Reinforcement Learning

📅 2025-07-09
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Offline multi-task reinforcement learning faces challenges in cross-task knowledge sharing. To address this, we propose a goal-oriented skill abstraction framework: (1) discrete, reusable skills are extracted from mixed-task offline datasets via vector quantization; (2) a skill enhancement mechanism is introduced to improve skill discriminability and robustness; and (3) a hierarchical policy—comprising a high-level scheduler and low-level executor—is constructed to enable task-adaptive skill selection and transfer. Crucially, the method operates entirely offline, requiring no online interaction. Evaluated on the MetaWorld benchmark, it achieves significant improvements in multi-task average performance (+23.6%) and zero-shot generalization across unseen tasks. These results empirically validate the effectiveness and generality of skill abstraction in offline multi-task RL settings.

Technology Category

Application Category

📝 Abstract
Offline multi-task reinforcement learning aims to learn a unified policy capable of solving multiple tasks using only pre-collected task-mixed datasets, without requiring any online interaction with the environment. However, it faces significant challenges in effectively sharing knowledge across tasks. Inspired by the efficient knowledge abstraction observed in human learning, we propose Goal-Oriented Skill Abstraction (GO-Skill), a novel approach designed to extract and utilize reusable skills to enhance knowledge transfer and task performance. Our approach uncovers reusable skills through a goal-oriented skill extraction process and leverages vector quantization to construct a discrete skill library. To mitigate class imbalances between broadly applicable and task-specific skills, we introduce a skill enhancement phase to refine the extracted skills. Furthermore, we integrate these skills using hierarchical policy learning, enabling the construction of a high-level policy that dynamically orchestrates discrete skills to accomplish specific tasks. Extensive experiments on diverse robotic manipulation tasks within the MetaWorld benchmark demonstrate the effectiveness and versatility of GO-Skill.
Problem

Research questions and friction points this paper is trying to address.

Enhance knowledge transfer in offline multi-task reinforcement learning
Extract reusable skills for improved task performance
Address class imbalance between general and task-specific skills
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts reusable skills via goal-oriented process
Uses vector quantization for discrete skill library
Integrates skills with hierarchical policy learning
J
Jinmin He
C2DL, Institute of Automation, Chinese Academy of Sciences
K
Kai Li
School of Artificial Intelligence, University of Chinese Academy of Sciences
Y
Yifan Zang
School of Artificial Intelligence, University of Chinese Academy of Sciences
Haobo Fu
Haobo Fu
Tencent AI Lab, University of Birmingham
Reinforcement LearningEvolutionary Computation
Q
Qiang Fu
Tencent AI Lab
J
Junliang Xing
Tsinghua University
J
Jian Cheng
AiRiA