🤖 AI Summary
In goal-conditioned reinforcement learning (GCRL), goal representations are vulnerable to exogenous noise and often fail to be both consistent with the environment's dynamics and informative enough for the policy. To address this, the authors propose dual goal representations, which encode a state through its temporal distances from all other states. Because these distances depend only on the environment's intrinsic dynamics, the resulting goal encodings are invariant to the original state representation and robust to exogenous noise, while provably retaining enough information to recover an optimal goal-reaching policy. The method is plug-and-play with any existing GCRL algorithm and applies naturally to the offline setting. Across 20 state- and pixel-based tasks from OGBench, it consistently and significantly improves offline goal-reaching performance.
📝 Abstract
In this work, we introduce dual goal representations for goal-conditioned reinforcement learning (GCRL). A dual goal representation characterizes a state by "the set of temporal distances from all other states"; in other words, it encodes a state through its relations to every other state, measured by temporal distance. This representation provides several appealing theoretical properties. First, it depends only on the intrinsic dynamics of the environment and is invariant to the original state representation. Second, it contains provably sufficient information to recover an optimal goal-reaching policy, while being able to filter out exogenous noise. Based on this concept, we develop a practical goal representation learning method that can be combined with any existing GCRL algorithm. Through diverse experiments on the OGBench task suite, we empirically show that dual goal representations consistently improve offline goal-reaching performance across 20 state- and pixel-based tasks.
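To make the core idea concrete, here is a minimal sketch of a dual goal representation in a toy setting. The paper learns temporal distances from offline data; this sketch instead uses a hypothetical 5-state deterministic chain environment where the temporal distance is the exact shortest-path step count, computed by BFS. All names (`neighbors`, `temporal_distance`, `dual_representation`) are illustrative, not the paper's API.

```python
from collections import deque

# Hypothetical deterministic chain environment with 5 states (0..4);
# actions move one step left or right, clipped at the boundaries.
N = 5

def neighbors(s):
    # States reachable from s in one step.
    return [max(s - 1, 0), min(s + 1, N - 1)]

def temporal_distance(s, g):
    # Minimum number of environment steps to reach g from s (BFS).
    seen, frontier = {s}, deque([(s, 0)])
    while frontier:
        cur, d = frontier.popleft()
        if cur == g:
            return d
        for nxt in neighbors(cur):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return float("inf")  # g unreachable from s

def dual_representation(g):
    # Dual goal representation of g: the vector of temporal
    # distances from every state to g.
    return [temporal_distance(s, g) for s in range(N)]

print(dual_representation(2))  # -> [2, 1, 0, 1, 2]
```

Note that the representation is defined purely by how long it takes to reach the goal from each state, so any relabeling or noisy augmentation of the raw state observations that preserves the dynamics leaves it unchanged, which is the invariance property the abstract highlights.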