Task Adaptation from Skills: Information Geometry, Disentanglement, and New Objectives for Unsupervised Reinforcement Learning

📅 2025-06-12

🏛️ International Conference on Learning Representations

📈 Citations: 7

✨ Influential: 0

🤖 AI Summary

Unsupervised reinforcement learning (URL) aims to acquire transferable skills for unknown downstream tasks, yet existing mutual information skill learning (MISL) lacks a theoretical characterization of skill transferability. We show that skill diversity and separability are essential for efficiently initializing downstream policies, and that MISL does not guarantee these properties. To address this, we propose a disentanglement metric, LSEPIN, and two objectives based on Wasserstein distance, WSEP and PWSEP. We connect these quantities to downstream adaptation cost through a geometric analysis and show that PWSEP can, in theory, discover all optimal initial policies. Experiments demonstrate significant improvements in zero-shot transfer performance across multiple benchmarks, consistently outperforming MISL. Our method yields more disentangled skill representations and superior policy pretraining, enabling more effective downstream adaptation.

📝 Abstract

Unsupervised reinforcement learning (URL) aims to learn general skills for unseen downstream tasks. Mutual Information Skill Learning (MISL) addresses URL by maximizing the mutual information between states and skills but lacks sufficient theoretical analysis, e.g., how well its learned skills can initialize a downstream task's policy. Our new theoretical analysis in this paper shows that the diversity and separability of learned skills are fundamentally critical to downstream task adaptation but MISL does not necessarily guarantee these properties. To complement MISL, we propose a novel disentanglement metric LSEPIN. Moreover, we build an information-geometric connection between LSEPIN and downstream task adaptation cost. For better geometric properties, we investigate a new strategy that replaces the KL divergence in information geometry with Wasserstein distance. We extend the geometric analysis to it, which leads to a novel skill-learning objective WSEP. It is theoretically justified to be helpful to downstream task adaptation and it is capable of discovering more initial policies for downstream tasks than MISL. We finally propose another Wasserstein distance-based algorithm PWSEP that can theoretically discover all optimal initial policies.
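
The contrast between a mutual-information objective and a Wasserstein-based one can be illustrated with a toy example. The sketch below is not the paper's definition of LSEPIN, WSEP, or PWSEP; it only assumes a hypothetical setting with two equally likely skills, each of which visits a single state on a one-dimensional grid. The helper names (mutual_information, wasserstein_1d, point_mass) are illustrative. It shows that the mutual information between states and skills is the same whether the two skills reach adjacent states or distant ones, while the Wasserstein distance between their state distributions grows with the separation.

```python
import math

def mutual_information(p_z, p_s_given_z):
    """I(S;Z) = sum_z p(z) * KL(p(s|z) || p(s)) for discrete states, in nats."""
    n_states = len(p_s_given_z[0])
    # Marginal state distribution p(s) = sum_z p(z) p(s|z).
    p_s = [sum(pz * row[s] for pz, row in zip(p_z, p_s_given_z))
           for s in range(n_states)]
    mi = 0.0
    for pz, row in zip(p_z, p_s_given_z):
        for s in range(n_states):
            if row[s] > 0:
                mi += pz * row[s] * math.log(row[s] / p_s[s])
    return mi

def wasserstein_1d(p, q):
    """Wasserstein-1 distance between distributions on the grid 0..n-1.

    On a unit-spaced 1D grid this equals the sum of absolute CDF differences.
    """
    cdf_gap, total = 0.0, 0.0
    for a, b in zip(p, q):
        cdf_gap += a - b
        total += abs(cdf_gap)
    return total

def point_mass(n, i):
    """Distribution on n states that puts all mass on state i."""
    return [1.0 if k == i else 0.0 for k in range(n)]

n = 10
p_z = [0.5, 0.5]                            # two equally likely skills (assumed)
near = [point_mass(n, 0), point_mass(n, 1)]  # skills end in adjacent states
far = [point_mass(n, 0), point_mass(n, 9)]   # skills end in distant states

mi_near = mutual_information(p_z, near)
mi_far = mutual_information(p_z, far)
w_near = wasserstein_1d(*near)
w_far = wasserstein_1d(*far)

print(f"MI near={mi_near:.4f}  far={mi_far:.4f}")  # both equal log 2
print(f"W1 near={w_near:.1f}  far={w_far:.1f}")
```

In this toy case the mutual information saturates at log 2 as soon as the two skills are distinguishable, so it carries no signal about how far apart the skills are in state space, whereas the Wasserstein distance does.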

Problem

Research questions and friction points this paper is trying to address.

Analyzing skill diversity and separability in unsupervised reinforcement learning

Proposing a disentanglement metric LSEPIN for skill learning

Developing Wasserstein distance-based objectives for better task adaptation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes disentanglement metric LSEPIN for skill learning

Replaces KL divergence with Wasserstein distance, yielding the skill-learning objective WSEP

Introduces the PWSEP algorithm, which can theoretically discover all optimal initial policies
