🤖 AI Summary
This paper proposes a training-free, zero-shot world model that approximates environment transition dynamics solely via similarity search and random representations, addressing the heavy reliance of conventional trained world models (e.g., PlaNet, Dreamer) on extensive interaction data. Methodologically, it abandons explicit parametric learning; instead, it retrieves semantically similar latent states directly from a memory bank and employs random projections for efficient dynamics approximation and image reconstruction. Its key contribution is the first integration of nonparametric similarity search into world model construction, eliminating gradient-based optimization and explicit latent-space learning. Experiments across diverse vision-based RL environments with substantial visual disparities show that the model achieves single-step prediction quality comparable to PlaNet, stronger long-horizon forecasting performance, and significantly improved sample efficiency in online learning.
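To make the idea concrete, here is a minimal sketch of such a search-based world model: a fixed random projection serves as an untrained encoder, and next-state prediction is a nearest-neighbor lookup over a memory bank of stored transitions. This is a hypothetical illustration, not the paper's actual code; the dimensions, the cosine-similarity retrieval, the averaging over neighbors, and the omission of action conditioning are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

OBS_DIM = 64 * 64   # flattened grayscale frame (assumed size)
LATENT_DIM = 128    # random-projection latent size (assumed)

# Fixed random projection: an untrained, training-free "encoder".
W = rng.normal(0.0, 1.0 / np.sqrt(OBS_DIM), size=(OBS_DIM, LATENT_DIM))

def encode(obs: np.ndarray) -> np.ndarray:
    """Project a flattened observation into the random latent space."""
    return obs @ W

# Memory bank of observed transitions: latent state and its successor.
memory_z = []     # latent states z_t
memory_next = []  # successor latents z_{t+1}

def store_transition(obs: np.ndarray, next_obs: np.ndarray) -> None:
    memory_z.append(encode(obs))
    memory_next.append(encode(next_obs))

def predict_next(obs: np.ndarray, k: int = 5) -> np.ndarray:
    """Predict the next latent by averaging the successors of the
    k most similar stored latents (cosine similarity)."""
    z = encode(obs)
    bank = np.stack(memory_z)
    sims = bank @ z / (np.linalg.norm(bank, axis=1) * np.linalg.norm(z) + 1e-8)
    idx = np.argsort(sims)[-k:]  # indices of the k nearest neighbors
    return np.stack(memory_next)[idx].mean(axis=0)

# Fill the bank with synthetic transitions, then predict one step.
for _ in range(100):
    o = rng.random(OBS_DIM)
    store_transition(o, o + 0.01)

pred = predict_next(rng.random(OBS_DIM))
print(pred.shape)  # → (128,)
```

Long-horizon rollouts would simply iterate `predict_next` on its own output in latent space; no gradient step ever occurs, which is what distinguishes this family from PlaNet-style learned dynamics.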
📝 Abstract
World models have become pervasive in reinforcement learning. Their ability to model the transition dynamics of an environment has greatly improved sample efficiency in online RL. Among them, the best-known example is Dreamer, a model that learns to act in a diverse set of image-based environments. In this paper, we leverage similarity search and stochastic representations to approximate a world model without a training procedure. We establish a comparison with PlaNet, a well-established world model of the Dreamer family. We evaluate the models on the quality of latent reconstruction and on the perceived similarity of the reconstructed image, for both next-step and long-horizon dynamics prediction. The results of our study demonstrate that a search-based world model is comparable to a training-based one in both cases. Notably, our model shows stronger long-horizon prediction performance than the baseline across a range of visually distinct environments.