🤖 AI Summary
This work addresses compositional generalization in offline goal-conditioned reinforcement learning (GCRL): how an agent can reach unseen goals under novel contextual variations. The authors formalize analogy transduction, which synthesizes new plans by composing task-endogenous analogies with given contexts, and propose an analogy representation tailored for it. The representation captures what changes under optimal task execution, remains invariant to contextual variations, and is sufficient for optimal goal reaching. Building on it, they introduce an offline GCRL approach that extends analogy transduction from seen analogy-context pairs to unseen combinations. On OGBench manipulation environments, the method substantially outperforms prior methods that do not perform analogy transduction.
📝 Abstract
Compositional generalization is essential for reaching unseen goals under novel contextual variations in offline goal-conditioned reinforcement learning (GCRL), where a generalist goal-reaching agent must be learned from limited data. Most prior approaches pursue this via trajectory stitching over temporally contiguous segments, which limits their ability to compose behaviors across varying contexts. To overcome this limitation, we formalize analogy transduction as synthesizing new plans by composing task-endogenous analogies with given contexts, and we propose a novel analogy representation tailored for it. Grounded in our theory, this analogy representation captures what changes under optimal task execution, remains invariant to contextual variations, and is sufficient for optimal goal reaching. We further contend that generalization to unseen analogy-context pairs is a practical obstacle in analogy transduction, and we introduce a new approach for offline GCRL that extends analogy transduction from seen pairs to unseen combinations. We empirically demonstrate the effectiveness of our approach on OGBench manipulation environments, where it substantially outperforms prior methods that do not perform analogy transduction. Project page: https://rllab-snu.github.io/projects/CTA/