Remember to be Curious: Episodic Context and Persistent Worlds for 3D Exploration

📅 2026-05-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses exploration in sparse-reward, long-horizon 3D tasks, where curiosity-driven agents can become trapped in local loops and collect fresh intrinsic reward for revisiting states they have forgotten. The authors attribute this failure to a lack of spatial persistence and episodic context, and address it by pairing an online 3D reconstruction, used as a persistent world model, with a policy parameterized as a sequence model over RGB observations. The reconstruction supports exploration during training, while the deployed agent navigates from RGB frames alone. Trained purely via curiosity on Habitat-Matterport 3D (HM3D), the agent outperforms RL-based active mapping baselines and generalizes zero-shot to Gibson and AI-generated worlds. The same end-to-end policy adapts efficiently to downstream tasks such as apple picking and image-goal navigation, where it outperforms from-scratch baselines.
📝 Abstract
Exploration is a prerequisite for learning useful behaviors in sparse-reward, long-horizon tasks, particularly within 3D environments. Curiosity-driven reinforcement learning addresses this via intrinsic rewards derived from the mismatch between the agent's predictive model of the world and reality. However, translating this intrinsic motivation to complex, photorealistic environments remains difficult, as agents can become trapped in local loops and receive fresh rewards for revisiting forgotten states. In this work, we demonstrate that this failure stems from a lack of spatial persistence and episodic context. We show that effective curiosity requires a model of the world that is persistent and continuously updated, paired with an agent that maintains an episodic trajectory history to navigate toward novel regions. We achieve this using an online 3D reconstruction as a persistent model of the world, while the agent policy is parameterized as a sequence model over RGB observations to maintain episodic context. This design enables effective exploration during training while allowing the agent to navigate using solely RGB frames at deployment. Trained purely via curiosity on HM3D, our agent outperforms RL-based active mapping baselines and generalizes zero-shot to Gibson and AI-generated worlds. Our end-to-end policy enables efficient adaptation to downstream tasks, such as apple picking and image-goal navigation, outperforming from-scratch baselines. Please see video results at https://recuriosity.github.io/.
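The failure mode described in the abstract (fresh rewards for revisiting forgotten states) can be illustrated with a toy example. The sketch below is not the paper's implementation: it assumes a simple voxel-count novelty reward as a stand-in for the reconstruction-based signal, and every name in it (`voxelize`, `PersistentMap`, `ForgetfulMap`, `run_loop`) is hypothetical. It contrasts a map that remembers everything it has observed with one that only remembers the last few observations.

```python
from collections import deque
from typing import Iterable, List, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], voxel_size: float = 0.25) -> Set[Voxel]:
    """Map 3D points (in metres) to integer voxel indices."""
    return {tuple(int(c // voxel_size) for c in p) for p in points}


class PersistentMap:
    """Toy persistent world model: never forgets an observed voxel."""

    def __init__(self, voxel_size: float = 0.25) -> None:
        self.voxel_size = voxel_size
        self.seen: Set[Voxel] = set()

    def reward(self, points: Iterable[Point]) -> int:
        # Intrinsic reward = number of voxels observed for the first time.
        observed = voxelize(points, self.voxel_size)
        new = observed - self.seen
        self.seen |= new
        return len(new)


class ForgetfulMap:
    """Toy non-persistent model: remembers only the last `horizon` observations."""

    def __init__(self, horizon: int = 2, voxel_size: float = 0.25) -> None:
        self.voxel_size = voxel_size
        self.recent: deque = deque(maxlen=horizon)

    def reward(self, points: Iterable[Point]) -> int:
        observed = voxelize(points, self.voxel_size)
        remembered = set().union(*self.recent)
        self.recent.append(observed)
        # Anything outside the short memory window counts as novel again.
        return len(observed - remembered)


def run_loop(model, trajectory: List[List[Point]]) -> List[int]:
    """Feed a sequence of observations to a model and collect its rewards."""
    return [model.reward(obs) for obs in trajectory]


# Three hypothetical "rooms", each reduced to a single observed point.
room_a = [(0.1, 0.1, 0.1)]
room_b = [(1.1, 0.1, 0.1)]
room_c = [(2.1, 0.1, 0.1)]
loop = [room_a, room_b, room_c, room_a, room_b, room_c]

persistent_rewards = run_loop(PersistentMap(), loop)
forgetful_rewards = run_loop(ForgetfulMap(horizon=2), loop)
```

In this toy setting the forgetful model keeps paying out on every lap of the loop, while the persistent map stops rewarding the agent once the three rooms are covered, which is the behavior the abstract argues effective curiosity requires.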
Problem

Research questions and friction points this paper is trying to address.

exploration
curiosity-driven reinforcement learning
3D environments
sparse-reward tasks
episodic context
Innovation

Methods, ideas, or system contributions that make the work stand out.

curiosity-driven exploration
persistent 3D world model
episodic context
online 3D reconstruction
zero-shot generalization