Periodic Skill Discovery

📅 2025-11-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing unsupervised skill discovery methods often overlook the periodic nature of skills, which limits their usefulness for robot locomotion and other tasks that require periodic behaviors across varying timescales. Periodic Skill Discovery (PSD) addresses this by training an encoder that maps states to a circular latent space, so that periodicity is encoded directly in the latent representation. By capturing temporal distance between states, PSD learns skills with diverse periods without external rewards, even from pixel-based observations. The learned skills achieve high performance on downstream tasks such as hurdling, and combining PSD with an existing skill discovery method yields more diverse behaviors.

📝 Abstract

Unsupervised skill discovery in reinforcement learning (RL) aims to learn diverse behaviors without relying on external rewards. However, current methods often overlook the periodic nature of learned skills, focusing instead on increasing the mutual dependence between states and skills or maximizing the distance traveled in latent space. Considering that many robotic tasks - particularly those involving locomotion - require periodic behaviors across varying timescales, the ability to discover diverse periodic skills is essential. Motivated by this, we propose Periodic Skill Discovery (PSD), a framework that discovers periodic behaviors in an unsupervised manner. The key idea of PSD is to train an encoder that maps states to a circular latent space, thereby naturally encoding periodicity in the latent representation. By capturing temporal distance, PSD can effectively learn skills with diverse periods in complex robotic tasks, even with pixel-based observations. We further show that these learned skills achieve high performance on downstream tasks such as hurdling. Moreover, integrating PSD with an existing skill discovery method offers more diverse behaviors, thus broadening the agent's repertoire. Our code and demos are available at https://jonghaepark.github.io/psd/

Problem

Research questions and friction points this paper is trying to address.

Discovering diverse periodic behaviors without external rewards in reinforcement learning

Overcoming the tendency of current methods to ignore the periodicity of learned skills

Learning robotic locomotion skills with varying timescales from pixel observations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Trains encoder mapping states to circular latent space

Captures temporal distance to learn diverse periodic skills

Integrates with existing methods for broader behavioral repertoire
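To make the circular-latent idea concrete, the sketch below places the time steps of one cycle on a circle and measures how far apart two steps land. It is only an illustration of the geometry, not the paper's encoder or training objective: the function names, the fixed `period`, and the hand-written embedding are all assumptions made for this example, whereas PSD learns its encoder from data.

```python
import math
from typing import Tuple


def circle_embedding(t: int, period: int, radius: float = 1.0) -> Tuple[float, float]:
    """Hypothetical hand-written embedding: put time step t of a cycle on a circle.

    Steps that differ by a whole number of periods map to the same point.
    """
    angle = 2.0 * math.pi * (t % period) / period
    return (radius * math.cos(angle), radius * math.sin(angle))


def latent_distance(t1: int, t2: int, period: int, radius: float = 1.0) -> float:
    """Euclidean distance between the embeddings of two time steps."""
    x1, y1 = circle_embedding(t1, period, radius)
    x2, y2 = circle_embedding(t2, period, radius)
    return math.hypot(x1 - x2, y1 - y2)


def chord_length(k: int, period: int, radius: float = 1.0) -> float:
    """Closed-form chord length for two points k steps apart on the circle."""
    return 2.0 * radius * abs(math.sin(math.pi * k / period))


if __name__ == "__main__":
    period = 20  # assumed cycle length, in time steps
    for k in (0, 5, 10, 15, 20):
        print(k, round(latent_distance(0, k, period), 4))
```

In this toy setting, steps half a period apart sit at opposite ends of a diameter, and steps a full period apart coincide, which is the sense in which a circular latent space can represent temporal distance within a repeating motion.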
