🤖 AI Summary
This work proposes a structured skill discovery framework based on state-space factorization to address the challenge of learning dynamic, diverse skills that cover all controllable factors in complex environments. The approach decomposes the state into independent factors—such as objects or entities—and assigns each a dedicated skill variable. By maximizing mutual information between states and skills while simultaneously maximizing distances among skill representations, the method achieves disentangled, fine-grained skill encodings. A dynamic exploration scheduling mechanism is further introduced to adaptively guide the agent toward under-explored factors. Experiments across three environments containing 1 to 10 factors demonstrate that the proposed method significantly outperforms existing approaches, not only discovering a rich repertoire of unsupervised skills but also efficiently facilitating downstream compositional task training.
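The core objective described above — a per-factor skill variable trained by maximizing mutual information between each factor's state and its skill, plus a term pushing skill representations apart — can be illustrated with a minimal sketch. The function names, the linear discriminators, and the mean-pairwise-distance bonus below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def factorized_intrinsic_reward(factor_states, skills, discriminators):
    """Per-factor intrinsic reward log q(z_i | s_i), a standard variational
    lower bound on the MI between factor state s_i and skill z_i.

    factor_states : list of per-factor state vectors s_i
    skills        : list of discrete skill indices z_i, one per factor
    discriminators: list of callables mapping s_i -> skill logits
                    (hypothetical stand-ins for learned networks)
    """
    rewards = []
    for s_i, z_i, disc in zip(factor_states, skills, discriminators):
        logits = disc(s_i)
        m = logits.max()  # numerically stable log-softmax
        log_probs = logits - (m + np.log(np.exp(logits - m).sum()))
        rewards.append(log_probs[z_i])
    return np.array(rewards)

def skill_separation_bonus(skill_embeddings):
    """Mean pairwise L2 distance among skill embeddings; maximizing this
    pushes skill representations apart (the distance-maximizing term)."""
    z = np.asarray(skill_embeddings, dtype=float)
    diffs = z[:, None, :] - z[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(-1))
    n = len(z)
    return dists.sum() / (n * (n - 1)) if n > 1 else 0.0
```

In this sketch the agent's total intrinsic reward would combine the per-factor MI rewards with the separation bonus, one skill head per state factor.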
📝 Abstract
Unsupervised Skill Discovery (USD) aims to autonomously learn a diverse set of skills without relying on extrinsic rewards. One of the most common USD approaches is to maximize the Mutual Information (MI) between skill latent variables and states. However, MI-based methods tend to favor simple, static skills due to their invariance properties, limiting the discovery of dynamic, task-relevant behaviors. Distance-Maximizing Skill Discovery (DSD) promotes more dynamic skills by leveraging state-space distances, yet still falls short of encouraging comprehensive skill sets that engage all controllable factors or entities in the environment. In this work, we introduce SUSD, a novel framework that harnesses the compositional structure of environments by factorizing the state space into independent components (e.g., objects or controllable entities). SUSD allocates distinct skill variables to different factors, enabling more fine-grained control over the skill discovery process. A dynamic model also tracks learning across factors, adaptively steering the agent's focus toward underexplored factors. This structured approach not only promotes the discovery of richer and more diverse skills, but also yields a factorized skill representation that enables fine-grained and disentangled control over individual entities, which in turn facilitates efficient training on compositional downstream tasks via Hierarchical Reinforcement Learning (HRL). Our experimental results across three environments, with factors ranging from 1 to 10, demonstrate that our method can discover diverse and complex skills without supervision, significantly outperforming existing unsupervised skill discovery methods in factorized and complex environments. Code is publicly available at: https://github.com/hadi-hosseini/SUSD.
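The dynamic exploration scheduling mentioned in the abstract — steering the agent toward under-explored factors — can be sketched as a sampling rule that favors factors with low learning progress. The softmax-over-negative-progress heuristic below is an assumption for illustration; the paper's actual scheduling mechanism may differ.

```python
import numpy as np

def schedule_factor(progress, temperature=0.5, rng=None):
    """Sample a factor to focus exploration on, favoring factors with low
    learning progress (e.g. low per-factor discriminator accuracy).

    progress: per-factor learning-progress scores in [0, 1] (assumed proxy)
    Returns the sampled factor index and the sampling distribution.
    """
    rng = rng if rng is not None else np.random.default_rng()
    logits = -np.asarray(progress, dtype=float) / temperature
    probs = np.exp(logits - logits.max())  # stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs)), probs
```

Lowering `temperature` concentrates exploration on the single least-learned factor; raising it spreads attention more evenly across factors.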