🤖 AI Summary
This work addresses goal-conditioned manipulation tasks without reward annotations, proposing Equivariant Contrastive Reinforcement Learning (ECRL) to improve sample efficiency and spatial generalization. Methodologically, ECRL constrains the latent space with group symmetry (specifically, rotational symmetry) by pairing a rotation-invariant critic with a rotation-equivariant actor within a goal-conditioned group-invariant MDP framework. It further incorporates these equivariance priors into contrastive representation learning, enabling structured representation learning without manual reward engineering. The approach supports both state-based and image-based observations and is compatible with offline RL settings. Empirically, ECRL achieves substantial improvements over strong baselines across diverse simulated manipulation benchmarks, demonstrating superior sample efficiency, robust spatial generalization to unseen object orientations and positions, and effective adaptation in offline learning.
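For intuition, the symmetry constraints described above can be sketched as follows (the notation is illustrative, not necessarily the paper's): for a goal-conditioned critic $Q$ and policy $\pi$, a rotation $g$ applied jointly to state $s$, action $a$, and goal $s_g$ should leave the critic unchanged while rotating the policy's output accordingly:

$$
Q(g \cdot s,\ g \cdot a,\ g \cdot s_g) = Q(s, a, s_g),
\qquad
\pi(g \cdot s,\ g \cdot s_g) = g \cdot \pi(s, s_g),
\qquad \forall g \in G,
$$

where $G$ is the rotation group acting on the workspace (e.g., a discrete subgroup of $SO(2)$ for planar manipulation).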
📝 Abstract
Contrastive Reinforcement Learning (CRL) provides a promising framework for extracting useful structured representations from unlabeled interactions. By pulling together state-action pairs and their corresponding future states while pushing apart negative pairs, CRL enables learning nontrivial policies without manually designed rewards. In this work, we propose Equivariant CRL (ECRL), which further structures the latent space using equivariant constraints. By leveraging inherent symmetries in goal-conditioned manipulation tasks, our method improves both sample efficiency and spatial generalization. Specifically, we formally define Goal-Conditioned Group-Invariant MDPs to characterize rotation-symmetric robotic manipulation tasks, and build on this by introducing a novel rotation-invariant critic representation paired with a rotation-equivariant actor for Contrastive RL. Our approach consistently outperforms strong baselines across a range of simulated tasks in both state-based and image-based settings. Finally, we extend our method to the offline RL setting, demonstrating its effectiveness across multiple tasks.
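For concreteness, below is a minimal sketch of the contrastive objective the abstract describes, assuming an InfoNCE-style loss over batched embeddings of state-action pairs and their future states. The encoder names `phi_sa`/`psi_f` and the exact loss form are our assumptions for illustration, not taken from the paper:

```python
import torch
import torch.nn.functional as F

def contrastive_critic_loss(phi_sa: torch.Tensor, psi_f: torch.Tensor) -> torch.Tensor:
    """InfoNCE-style contrastive critic loss (illustrative sketch).

    phi_sa: (B, D) embeddings of state-action pairs, phi(s, a).
    psi_f:  (B, D) embeddings of the corresponding future states, psi(s_f).
    Row i of the B x B similarity matrix treats psi_f[i] as the positive
    pair for phi_sa[i]; the other rows in the batch serve as negatives.
    """
    logits = phi_sa @ psi_f.T                                     # (B, B) pairwise similarities
    labels = torch.arange(logits.shape[0], device=logits.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)
```

In the equivariant variant, the encoders would additionally be constrained (e.g., via equivariant network layers) so that the similarity between phi(s, a) and psi(s_f) is unchanged when the same rotation is applied to state, action, and future state, matching the rotation-invariant critic condition sketched earlier.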