ViSA: Visited-State Augmentation for Generalized Goal-Space Contrastive Reinforcement Learning

📅 2026-03-16
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses the limited generalization of conventional contrastive reinforcement learning in goal-conditioned tasks, particularly to rare or hard-to-reach goal states, which stems from inaccuracies in value function estimation due to insufficient goal coverage. To overcome this, the authors propose ViSA, a method that leverages state data augmentation to generate diverse training samples and incorporates mutual information regularization alongside embedding space consistency constraints. This approach constructs a goal-sensitive yet structurally coherent representation space, substantially enhancing the policy’s ability to estimate values across a broad goal distribution. Empirical results demonstrate that ViSA achieves effective generalization and accurate evaluation for challenging goals in both simulated and real-world robotic tasks.

πŸ“ Abstract
Goal-Conditioned Reinforcement Learning (GCRL) is a framework for learning a policy that can reach arbitrarily given goals. In particular, Contrastive Reinforcement Learning (CRL) provides a framework for policy updates using an approximation of the value function estimated via contrastive learning, achieving higher sample efficiency compared to conventional methods. However, since CRL treats the visited state as a pseudo-goal during learning, it can accurately estimate the value function only for limited goals. To address this issue, we propose a novel data augmentation approach for CRL called ViSA (Visited-State Augmentation). ViSA consists of two components: 1) generating augmented state samples, with the aim of augmenting hard-to-visit state samples during on-policy exploration, and 2) learning a consistent embedding space, which uses an augmented state as auxiliary information to regularize the embedding space by reformulating the objective function of the embedding space based on mutual information. We evaluate ViSA in simulation and real-world robotic tasks and show improved goal-space generalization, which permits accurate value estimation for hard-to-visit goals. Further details can be found on the project page: https://issa-n.github.io/projectPage_ViSA/
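The two ingredients the abstract describes, a contrastive (InfoNCE-style) objective over state and pseudo-goal embeddings, and a consistency term that keeps augmented states close to their originals in embedding space, can be illustrated with a minimal, paper-agnostic sketch. The linear encoder, the uniform-noise augmentation, and the squared-distance penalty below are placeholder assumptions for illustration, not the authors' actual implementation.

```python
import math
import random

def embed(state, W):
    # Linear "encoder" stand-in for the paper's learned embedding network.
    return [sum(w * s for w, s in zip(row, state)) for row in W]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def infonce_loss(sa_embs, goal_embs):
    # CRL-style contrastive objective: each state embedding should score
    # highest against its own pseudo-goal, with the rest of the batch as
    # negatives (log-sum-exp computed stably via the max trick).
    loss = 0.0
    for i, sa in enumerate(sa_embs):
        logits = [dot(sa, g) for g in goal_embs]
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        loss += log_z - logits[i]
    return loss / len(sa_embs)

def augment(state, noise=0.1, rng=random):
    # Hypothetical perturbation standing in for ViSA's state augmentation,
    # which targets hard-to-visit states during on-policy exploration.
    return [s + rng.uniform(-noise, noise) for s in state]

def consistency_penalty(W, states, noise=0.1):
    # Embedding-space consistency: an augmented state should embed near its
    # original, regularizing the space (a proxy for the paper's
    # mutual-information-based reformulation).
    pen = 0.0
    for s in states:
        e, e_aug = embed(s, W), embed(augment(s, noise), W)
        pen += sum((a - b) ** 2 for a, b in zip(e, e_aug))
    return pen / len(states)
```

In training, the two terms would be combined, e.g. `infonce_loss(...) + lam * consistency_penalty(...)` with a weighting coefficient `lam`, so the encoder stays goal-discriminative while remaining smooth under augmentation.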
Problem

Research questions and friction points this paper is trying to address.

Goal-Conditioned Reinforcement Learning
Contrastive Reinforcement Learning
Value Function Estimation
Goal-Space Generalization
Visited-State Augmentation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Visited-State Augmentation
Contrastive Reinforcement Learning
Goal-Conditioned Reinforcement Learning
Embedding Regularization
Mutual Information
Issa Nakamura
Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Nara, Japan
Tomoya Yamanokuchi
Graduate School of Information Science, Nara Institute of Science and Technology (NAIST), Nara, Japan
Yuki Kadokawa
Nara Institute of Science and Technology
Reinforcement Learning, Robotics, Sim-to-Real, Machine Learning
Jia Qu
Advanced Technology R&D Center, Mitsubishi Electric Corporation, Hyogo, Japan
Shun Otsubo
Advanced Technology R&D Center, Mitsubishi Electric Corporation, Hyogo, Japan
Ken Miyamoto
Advanced Technology R&D Center, Mitsubishi Electric Corporation, Hyogo, Japan
Shotaro Miwa
Research Manager, Mitsubishi Electric Corp.; Visiting Researcher, University of Alberta
Computer Vision, Reinforcement Learning
Takamitsu Matsubara
Nara Institute of Science and Technology
Robot Learning, Machine Learning, Reinforcement Learning, Robotics