Leveraging Human Feedback for Semantically-Relevant Skill Discovery

📅 2026-04-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem that unsupervised skill discovery often yields unsafe or human-irrelevant behaviors, while existing preference-based feedback methods are feedback-inefficient and struggle with skill spaces that contain many different kinds of skill. The authors propose semantic labelling, a feedback mechanism in which humans identify and label semantically meaningful behaviors in agent trajectories; the labels are then used to learn a reward function. That reward guides skill discovery toward behaviors that are semantically diverse and relevant to human intent. In experiments on a 2D navigation task and four locomotion control environments, the method improves semantic diversity and discovers relevant behaviors while scaling to a large variety of behaviors.

📝 Abstract

Unsupervised skill discovery in reinforcement learning aims to intrinsically motivate agents to discover diverse and useful behaviours. However, unconstrained approaches can produce unsafe, unethical, or misaligned behaviours. To mitigate these risks and improve the practical desirability of discovered skills, recent work grounds the discovery process by leveraging human preference feedback. Preference-based approaches, however, are feedback-inefficient and inherently ill-equipped to deal with skill spaces composed of a variety of different skills, such as running, jumping, and walking. To overcome this limitation, we introduce semantic labelling, a novel and feedback-efficient approach that leverages human cognitive strengths to identify and label semantically meaningful behaviours. Based on semantic labelling, we propose Semantically Relevant Skill Discovery (SRSD), a novel human-in-the-loop approach that collects semantic labels from human feedback and learns a reward function to encourage skills to be more semantically diverse and relevant. Through our experiments in a 2D navigation environment and four locomotion environments, we demonstrate that SRSD can improve semantic diversity and discover relevant behaviours while scaling effectively to a large variety of behaviours.
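To make the idea of turning semantic labels into a reward signal concrete, here is a minimal sketch in Python. It is not the SRSD algorithm, whose details the abstract does not give. It assumes each trajectory has already been summarised as a small feature vector (here a hypothetical pair: mean forward velocity and mean torso height), uses a nearest-centroid classifier as a stand-in for the learned label model, gives a binary reward when the predicted label is in a hypothetical set of relevant labels, and measures semantic diversity as the entropy of predicted labels across skills. All names, features, and data are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def fit_centroids(features, labels):
    """Average the feature vectors that share a semantic label."""
    groups = defaultdict(list)
    for x, y in zip(features, labels):
        groups[y].append(x)
    return {y: [sum(col) / len(xs) for col in zip(*xs)] for y, xs in groups.items()}


def predict_label(centroids, x):
    """Return the label whose centroid is closest to x (Euclidean distance)."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


def relevance_reward(centroids, x, relevant):
    """Binary reward: 1.0 if the predicted label is one the human cares about."""
    return 1.0 if predict_label(centroids, x) in relevant else 0.0


def semantic_entropy(predicted):
    """Entropy (in nats) of the predicted labels across a set of skills."""
    counts = Counter(predicted)
    n = len(predicted)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Hypothetical human-labelled trajectories: (velocity, height) -> label.
labelled_features = [[1.0, 0.0], [1.2, 0.1], [0.0, 1.0], [0.1, 1.1], [0.0, 0.0]]
labelled_names = ["walk", "walk", "jump", "jump", "idle"]
centroids = fit_centroids(labelled_features, labelled_names)

# Hypothetical set of labels the human marked as relevant.
relevant = {"walk", "jump"}

# Score trajectories produced by three hypothetical skills.
skill_features = [[0.9, 0.0], [0.05, 0.05], [0.0, 0.9]]
skill_labels = [predict_label(centroids, x) for x in skill_features]
skill_rewards = [relevance_reward(centroids, x, relevant) for x in skill_features]
diversity = semantic_entropy(skill_labels)
```

In this toy setup, a skill-discovery agent would be rewarded for skills whose trajectories fall under relevant labels, and the entropy term gives one possible way to check whether the skill set covers several distinct labels rather than collapsing onto one.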

Problem

Research questions and friction points this paper is trying to address.

unsupervised skill discovery

human feedback

semantic diversity

reinforcement learning

skill alignment

Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic labelling

human-in-the-loop

skill discovery