🤖 AI Summary
Diffusion-based world models in reinforcement learning struggle to balance generation quality and control efficiency because imagination is highly sequential and computationally expensive. To address this, the authors propose Horizon Imagination (HI), an online policy-aware imagination mechanism tailored for discrete stochastic policies that enables parallel denoising of multiple future observations, substantially improving both training and inference efficiency. HI incorporates a stabilization mechanism and a novel sampling schedule that decouples the denoising budget from the effective prediction horizon, enabling sub-frame-level computational allocation. Evaluated on the Atari 100K and Craftium benchmarks, HI maintains superior control performance using only half the denoising steps and achieves higher-quality future frame generation across diverse scheduling strategies.
📝 Abstract
We study diffusion-based world models for reinforcement learning, which offer high generative fidelity but face critical efficiency challenges in control. Current methods either require heavyweight models at inference or rely on highly sequential imagination, both of which impose prohibitive computational costs. We propose Horizon Imagination (HI), an on-policy imagination process for discrete stochastic policies that denoises multiple future observations in parallel. HI incorporates a stabilization mechanism and a novel sampling schedule that decouples the denoising budget from the effective horizon over which denoising is applied while also supporting sub-frame budgets. Experiments on Atari 100K and Craftium show that our approach maintains control performance with a sub-frame budget of half the denoising steps and achieves superior generation quality under varied schedules. Code is available at https://github.com/leor-c/horizon-imagination.
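To make the budget/horizon decoupling concrete, here is a minimal illustrative sketch (not HI's actual schedule, which is defined in the paper and code): given a total denoising budget `K` and a prediction horizon of `H` frames, one can allocate steps per frame so that a sub-frame budget (`K < H`) leaves some frames with fewer, or even zero, dedicated steps. The function name and allocation rule below are hypothetical.

```python
def allocate_denoising_steps(total_budget: int, horizon: int) -> list[int]:
    """Illustrative toy allocation: spread `total_budget` denoising steps
    across `horizon` future frames. When total_budget < horizon (a
    "sub-frame" budget), average allocation drops below one step per frame.
    Giving the remainder to earlier frames is an arbitrary design choice
    here, not the schedule used by HI."""
    base, extra = divmod(total_budget, horizon)
    return [base + (1 if i < extra else 0) for i in range(horizon)]

# A budget of half the steps relative to a 16-frame horizon:
print(allocate_denoising_steps(8, 16))  # → [1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
```

The point of the sketch is only that the two quantities vary independently: the horizon fixes how far ahead imagination reaches, while the budget fixes total compute, and any schedule mapping one onto the other determines how denoising effort is distributed per frame.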