SUSD: Structured Unsupervised Skill Discovery through State Factorization

📅 2026-02-02
🤖 AI Summary
This work proposes a structured skill discovery framework based on state-space factorization to address the challenge of learning dynamic, diverse skills that cover all controllable factors in complex environments. The approach decomposes the state into independent factors—such as objects or entities—and assigns each a dedicated skill variable. By maximizing mutual information between states and skills while simultaneously maximizing distances among skill representations, the method achieves disentangled, fine-grained skill encodings. A dynamic exploration scheduling mechanism is further introduced to adaptively guide the agent toward under-explored factors. Experiments across three environments containing 1 to 10 factors demonstrate that the proposed method significantly outperforms existing approaches, not only discovering a rich repertoire of unsupervised skills but also efficiently facilitating downstream compositional task training.
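The factorized setup described above can be illustrated with a minimal sketch. It assumes a flat state vector that splits into contiguous per-factor slices, one skill vector per factor, and a directional distance-style reward of the form (phi_i(s'_i) - phi_i(s_i)) · z_i for each factor. The function names, the slice layout, and this particular reward form are illustrative assumptions, not the paper's actual objective or code.

```python
import numpy as np

def sample_skills(num_factors, skill_dim, rng):
    """Draw one unit-norm skill vector per factor (hypothetical skill prior)."""
    z = rng.standard_normal((num_factors, skill_dim))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def factored_reward(state, next_state, skills, encoders, factor_slices):
    """Return one reward per factor.

    Each factor i has its own encoder phi_i and skill z_i; the reward is the
    projection of the factor's latent displacement onto its skill direction.
    This reward form is an assumption made for illustration.
    """
    rewards = []
    for phi, sl, z in zip(encoders, factor_slices, skills):
        delta = phi(next_state[sl]) - phi(state[sl])
        rewards.append(float(delta @ z))
    return np.array(rewards)

# Toy example: two factors of two dimensions each, identity encoders.
factor_slices = [slice(0, 2), slice(2, 4)]
encoders = [lambda x: x, lambda x: x]
skills = np.array([[1.0, 0.0], [0.0, 1.0]])
state = np.zeros(4)
next_state = np.array([0.5, 0.0, 0.0, 0.0])  # only factor 0 moves

rewards = factored_reward(state, next_state, skills, encoders, factor_slices)
```

In this toy case only the first factor moves, and it moves along its own skill direction, so only the first entry of `rewards` is nonzero. Keeping the rewards separate per factor is what would let a scheduler weight them independently.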

📝 Abstract
Unsupervised Skill Discovery (USD) aims to autonomously learn a diverse set of skills without relying on extrinsic rewards. One of the most common USD approaches is to maximize the Mutual Information (MI) between skill latent variables and states. However, MI-based methods tend to favor simple, static skills due to their invariance properties, limiting the discovery of dynamic, task-relevant behaviors. Distance-Maximizing Skill Discovery (DSD) promotes more dynamic skills by leveraging state-space distances, yet still falls short of encouraging comprehensive skill sets that engage all controllable factors or entities in the environment. In this work, we introduce SUSD, a novel framework that harnesses the compositional structure of environments by factorizing the state space into independent components (e.g., objects or controllable entities). SUSD allocates distinct skill variables to different factors, enabling more fine-grained control over the skill discovery process. A dynamic model also tracks learning across factors, adaptively steering the agent's focus toward underexplored factors. This structured approach not only promotes the discovery of richer and more diverse skills, but also yields a factorized skill representation that enables fine-grained and disentangled control over individual entities, which in turn facilitates efficient training of compositional downstream tasks via Hierarchical Reinforcement Learning (HRL). Our experimental results across three environments, with factors ranging from 1 to 10, demonstrate that our method can discover diverse and complex skills without supervision, significantly outperforming existing unsupervised skill discovery methods in factorized and complex environments. Code is publicly available at: https://github.com/hadi-hosseini/SUSD.
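One way to picture the adaptive steering toward underexplored factors is a weighting over factors that favors those with low learning progress. The sketch below assumes a hypothetical per-factor `progress` score (higher means better explored) and turns it into weights with a softmax over the negated scores. The abstract does not specify how the dynamic model computes or uses such scores, so the scoring, the softmax, and the `temperature` parameter are all assumptions.

```python
import numpy as np

def factor_weights(progress, temperature=1.0):
    """Map per-factor progress scores to sampling weights.

    Lower progress yields a higher weight, so under-explored factors are
    emphasized. Both the score and the softmax rule are illustrative
    assumptions, not the mechanism used in the paper.
    """
    logits = -np.asarray(progress, dtype=float) / temperature
    logits = logits - logits.max()  # shift for numerical stability
    w = np.exp(logits)
    return w / w.sum()

# Three factors: the second has the least progress and gets the most weight.
weights = factor_weights([0.9, 0.1, 0.5])
```

Such weights could scale per-factor intrinsic rewards or decide which factor's skill to resample. A lower `temperature` concentrates the weight more sharply on the least-explored factor.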
Problem

Research questions and friction points this paper is trying to address.

Unsupervised Skill Discovery
Mutual Information
State Factorization
Dynamic Skills
Controllable Factors
Innovation

Methods, ideas, or system contributions that make the work stand out.

Structured Unsupervised Skill Discovery
State Factorization
Factorized Skill Representation
Hierarchical Reinforcement Learning
Mutual Information Maximization
Seyed Mohammad Hadi Hosseini
Department of Computer Engineering, Sharif University of Technology
Mahdieh Soleymani Baghshah
Associate Professor, Computer Engineering Department, Sharif University of Technology
Deep Learning, Machine Learning