AI Summary
This work addresses the challenge of reusing skills across tasks in offline hierarchical reinforcement learning by proposing a skill abstraction grounded in local dynamics similarity: local transitions in different global contexts tend to require similar action sequences. The method, CARL (Contrastive Action-based Representations for Reusable Local Control), uses contrastive learning to align these contexts with the action sequences they require, which identifies both which skills to reuse and where they apply. This information is intended to help the high-level policy reason about the low-level skills it invokes. Empirically, CARL yields qualitative clusters of meaningful skills in complex humanoid environments and improves downstream performance on the OGBench benchmark when integrated with HIQL.
Abstract
Hierarchical Reinforcement Learning (HRL) promises to solve long-horizon Reinforcement Learning (RL) tasks more efficiently than its non-hierarchical counterparts by discovering and reusing temporally extended skills. However, obtaining skills that are actually reusable remains an open challenge. To this end, we focus on abstractions that exploit the intuition of local dynamics: local transitions in different global contexts require similar kinds of action sequences. By aligning these contexts with the action sequences they require, we are able to learn which skills to reuse and where to reuse them. In principle, this information should benefit many HRL algorithms, where high-level policies have to reason about the low-level skills they use. The resulting algorithm, CARL (Contrastive Action-based Representations for Reusable Local Control), shows both qualitative clustering of meaningful skills in complex humanoid environments and improved downstream performance on the OGBench benchmark when integrated with HIQL.
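To make the idea of aligning contexts with action sequences concrete, the following is a minimal sketch of a generic contrastive objective, assuming a symmetric InfoNCE loss over paired embeddings. The abstract does not state which loss CARL uses, so the function name `info_nce`, the `temperature` parameter, and the embedding inputs are all hypothetical illustrations rather than the authors' implementation.

```python
import numpy as np


def info_nce(context_emb, action_emb, temperature=0.1):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Row i of `context_emb` and row i of `action_emb` are treated as a
    positive pair; every other row in the batch serves as a negative.
    Assumption: this is a generic contrastive loss, not CARL's stated one.
    """
    # L2-normalize so that the dot product is a cosine similarity.
    c = context_emb / np.linalg.norm(context_emb, axis=1, keepdims=True)
    a = action_emb / np.linalg.norm(action_emb, axis=1, keepdims=True)
    logits = c @ a.T / temperature  # shape (batch, batch)

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        shifted = scores - scores.max(axis=1, keepdims=True)
        log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
        # The matching pair for row i sits on the diagonal.
        return -np.mean(np.diag(log_probs))

    # Average the context-to-action and action-to-context directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))
```

Under this sketch, contexts that call for the same kind of action sequence are pulled toward a shared region of the embedding space, which is one way the "which skills, and where" information could be exposed to a high-level policy.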