🤖 AI Summary
Real-world deployment of intelligent agents is often hindered by scarce training data and the difficulty of constructing high-fidelity simulation environments. To address this, we propose IMAC, the first framework to integrate Unsupervised Environment Design (UED) into world-model-based imagined environments for adaptive, curriculum-driven training from offline data. IMAC combines world models, UED, and reinforcement learning to conduct progressive, curriculum-style imagination training within procedurally generated latent spaces. Experiments demonstrate that even a lightweight world model trained solely on narrow-domain offline data enables strong zero-shot transfer to unseen environments, which supports the feasibility of using compact world models for generalizable agent training. Our core contribution is the joint use of UED and world models for automated curriculum generation, which significantly improves cross-environment generalization.
📝 Abstract
Training agents to act in embodied environments typically requires vast training data or access to accurate simulation, neither of which exists for many cases in the real world. Instead, world models are emerging as an alternative: by leveraging offline, passively collected data, they make it possible to generate diverse worlds for training agents in simulation. In this work, we harness world models to generate imagined environments to train robust agents capable of generalizing to novel task variations. One of the challenges in doing this is ensuring the agent trains on useful generated data. We thus propose a novel approach, IMAC (Imagined Autocurricula), leveraging Unsupervised Environment Design (UED), which induces an automatic curriculum over generated worlds. In a series of challenging, procedurally generated environments, we show it is possible to achieve strong transfer performance on held-out environments, having trained only inside a world model learned from a narrower dataset. We believe this opens the path to utilizing larger-scale, foundation world models for generally capable agents.