Contextual Latent World Models for Offline Meta Reinforcement Learning

📅 2026-03-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Offline meta-reinforcement learning struggles to learn effective representations of task dynamics because supervisory signals are absent, which limits policy generalization to unseen tasks. This work proposes a Contextual Latent World Model that, for the first time, introduces a task-conditioned temporal consistency constraint into this setting. By jointly training a context encoder with a latent world model, the approach enables task representations not only to distinguish between tasks but also to capture their underlying dynamical differences. Empirical results demonstrate that the method significantly improves policies' cross-task generalization across multiple benchmarks, including MuJoCo, Contextual DeepMind Control Suite, and Meta-World.

📝 Abstract
Offline meta-reinforcement learning seeks to learn policies that generalize across related tasks from fixed datasets. Context-based methods infer a task representation from transition histories, but learning effective task representations without supervision remains a challenge. In parallel, latent world models have demonstrated strong self-supervised representation learning through temporal consistency. We introduce contextual latent world models, which condition latent world models on inferred task representations and train them jointly with the context encoder. This enforces task-conditioned temporal consistency, yielding task representations that capture task-dependent dynamics rather than merely discriminating between tasks. Our method learns more expressive task representations and significantly improves generalization to unseen tasks across MuJoCo, Contextual-DeepMind Control, and Meta-World benchmarks.
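The abstract describes the training objective only at a high level. The core idea — a latent dynamics model that is conditioned on an inferred task representation and trained to match the encoder's next-step latent — can be sketched as follows. This is a minimal illustration under assumed shapes: the linear "networks" (`W_obs`, `W_ctx`, `W_dyn`), mean-pooled context aggregation, and all dimensions are placeholder assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper)
OBS, ACT, LAT, CTX = 6, 2, 8, 4

# Stand-ins for learned networks: simple random linear maps
W_obs = rng.normal(size=(OBS, LAT)) * 0.1               # observation -> latent state
W_ctx = rng.normal(size=(OBS + ACT + OBS, CTX)) * 0.1   # (s, a, s') -> per-transition context feature
W_dyn = rng.normal(size=(LAT + ACT + CTX, LAT)) * 0.1   # latent dynamics, conditioned on task context

def encode_context(transitions):
    """Infer a task representation z from a batch of transitions by
    mean-pooling per-transition embeddings (a common permutation-invariant
    context-encoder design)."""
    feats = np.concatenate(
        [transitions["s"], transitions["a"], transitions["s_next"]], axis=-1
    )
    return np.tanh(feats @ W_ctx).mean(axis=0)  # shape (CTX,)

def consistency_loss(transitions, z):
    """Task-conditioned temporal consistency: given (latent state, action,
    task context z), the latent dynamics model should predict the encoder's
    latent of the next observation. Gradients would flow into both the
    world model and the context encoder, so z must explain the dynamics."""
    h = np.tanh(transitions["s"] @ W_obs)                   # current latents
    h_next_target = np.tanh(transitions["s_next"] @ W_obs)  # target next latents
    inp = np.concatenate(
        [h, transitions["a"], np.tile(z, (len(h), 1))], axis=-1
    )
    h_next_pred = np.tanh(inp @ W_dyn)
    return float(((h_next_pred - h_next_target) ** 2).mean())

# A small batch of synthetic transitions from one task
batch = {
    "s": rng.normal(size=(16, OBS)),
    "a": rng.normal(size=(16, ACT)),
    "s_next": rng.normal(size=(16, OBS)),
}
z = encode_context(batch)
loss = consistency_loss(batch, z)
print(z.shape, loss)
```

Because the same loss trains both the dynamics model and the context encoder, the representation z is pushed to encode what actually differs between tasks' transition dynamics, rather than only serving as a task-discrimination label.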
Problem

Research questions and friction points this paper is trying to address.

offline meta-reinforcement learning
task representation
latent world models
context-based methods
generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual Latent World Models
Offline Meta-Reinforcement Learning
Task Representation
Temporal Consistency
Self-Supervised Learning