WorldCache: Accelerating World Models for Free via Heterogeneous Token Caching

📅 2026-03-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Diffusion-based world models are expensive to run because of iterative denoising, which limits their use in interactive applications and long-horizon rollouts. Feature-caching policies designed for single-modal diffusion transfer poorly to world models, where tokens are heterogeneous (multi-modal coupling and spatial variation) and temporal dynamics are non-uniform. This work proposes WorldCache, a training-free caching framework that uses a physics-grounded curvature score to estimate per-token predictability and guide heterogeneous token prediction. A Hermite-guided damped predictor handles chaotic tokens with abrupt direction changes, and an adaptive skipping strategy accumulates a curvature-normalized drift signal and triggers recomputation only when bottleneck tokens begin to drift. Experiments report up to 3.7× end-to-end speedup while maintaining 98% rollout quality, which makes the approach practical in resource-constrained settings.

📝 Abstract

Diffusion-based world models have shown strong potential for unified world simulation, but the iterative denoising remains too costly for interactive use and long-horizon rollouts. While feature caching can accelerate inference without training, we find that policies designed for single-modal diffusion transfer poorly to world models due to two world-model-specific obstacles: token heterogeneity from multi-modal coupling and spatial variation, and non-uniform temporal dynamics where a small set of hard tokens drives error growth, making uniform skipping either unstable or overly conservative. We propose WorldCache, a caching framework tailored to diffusion world models. We introduce Curvature-guided Heterogeneous Token Prediction, which uses a physics-grounded curvature score to estimate token predictability and applies a Hermite-guided damped predictor for chaotic tokens with abrupt direction changes. We also design Chaotic-prioritized Adaptive Skipping, which accumulates a curvature-normalized, dimensionless drift signal and recomputes only when bottleneck tokens begin to drift. Experiments on diffusion world models show that WorldCache delivers up to 3.7× end-to-end speedups while maintaining 98% rollout quality, demonstrating the vast advantages and practicality of WorldCache in resource-constrained scenarios. Our code is released in https://github.com/FofGofx/WorldCache.
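
The abstract does not give formulas, so the following is only a rough sketch of the general idea, not the authors' implementation. It assumes a finite-difference curvature score (second difference divided by first difference), replaces the Hermite-guided predictor with a simpler damped linear extrapolation, and uses a hypothetical DriftGate class with an arbitrary budget to decide when to recompute. All function names, thresholds, and the damping rule are illustrative assumptions.

```python
import math


def _norm(vec):
    """Euclidean norm of a plain list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def curvature_score(f0, f1, f2, eps=1e-8):
    """Assumed curvature proxy for one token.

    f0, f1, f2 are the token's features at three consecutive computed
    denoising steps (oldest first). The score is the ratio of the second
    difference to the latest first difference, so it is dimensionless.
    """
    v_prev = [b - a for a, b in zip(f0, f1)]
    v_curr = [c - b for b, c in zip(f1, f2)]
    accel = [vc - vp for vp, vc in zip(v_prev, v_curr)]
    return _norm(accel) / (_norm(v_curr) + eps)


def predict_token(f1, f2, score, threshold=0.5):
    """Extrapolate the next feature instead of recomputing it.

    Low-curvature tokens get plain linear extrapolation. High-curvature
    ("chaotic") tokens get a damped step. This is a stand-in for the
    Hermite-guided damped predictor, whose exact form is not given here.
    """
    damping = 1.0 if score <= threshold else 1.0 / (1.0 + score)
    return [c + damping * (c - b) for b, c in zip(f1, f2)]


class DriftGate:
    """Hypothetical skip controller driven by the hardest token."""

    def __init__(self, budget=1.0):
        self.budget = budget
        self.accumulated = 0.0

    def should_recompute(self, scores):
        # Accumulate the worst (bottleneck) token's score at each skipped step.
        self.accumulated += max(scores)
        if self.accumulated >= self.budget:
            self.accumulated = 0.0  # a full recompute resets the drift
            return True
        return False
```

In this sketch a token moving in a straight line scores 0 and is extrapolated exactly, while a token that reverses direction scores about 2 and takes only a third of the linear step. The gate lets the model skip full denoising passes until the accumulated score of the bottleneck token exceeds the budget.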

Problem

Research questions and friction points this paper is trying to address.

diffusion world models

token heterogeneity

non-uniform temporal dynamics

iterative denoising

feature caching

Innovation

Methods, ideas, or system contributions that make the work stand out.

WorldCache

diffusion world models

heterogeneous token caching