Envisioning the Future, One Step at a Time

📅 2026-04-10

🤖 AI Summary
This work addresses a limitation of existing methods: they rely heavily on dense appearance representations and therefore struggle to model long-horizon, multimodal motion efficiently in complex scenes. The authors propose a dynamics-centric representation based on sparse point trajectories and formulate its evolution as an autoregressive diffusion process, so that stepwise local predictions explicitly capture how uncertainty accumulates over time. The approach generates thousands of diverse, physically plausible future trajectories from a single input image, and the authors introduce a new benchmark, OWM, for evaluating predictive distributions in open-world settings. Experiments show that the method matches or exceeds the prediction accuracy of current dense simulators while sampling orders of magnitude faster, making open-set future scene prediction scalable and practical.
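
As a reading aid, the "autoregressive" structure can be written as a chain-rule factorization. The notation is assumed here and does not come from the source: $I$ is the input image, $x_0$ the initial positions of the tracked sparse points, and $x_t$ their positions after $t$ steps, with $T$ steps in total. The exact conditioning used by the authors is not stated in this summary, so the general form is shown.

```latex
p(x_{1:T} \mid x_0, I) \;=\; \prod_{t=1}^{T} p\bigl(x_t \mid x_{0:t-1}, I\bigr)
```

Under this reading, each factor would be a short, local transition sampled by the diffusion model, and uncertainty about $x_T$ builds up through the product of these per-step distributions.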

📝 Abstract
Accurately anticipating how complex, diverse scenes will evolve requires models that represent uncertainty, simulate along extended interaction chains, and efficiently explore many plausible futures. Yet most existing approaches rely on dense video or latent-space prediction, expending substantial capacity on dense appearance rather than on the underlying sparse trajectories of points in the scene. This makes large-scale exploration of future hypotheses costly and limits performance when long-horizon, multi-modal motion is essential. We address this by formulating the prediction of open-set future scene dynamics as step-wise inference over sparse point trajectories. Our autoregressive diffusion model advances these trajectories through short, locally predictable transitions, explicitly modeling the growth of uncertainty over time. This dynamics-centric representation enables fast rollout of thousands of diverse futures from a single image, optionally guided by initial constraints on motion, while maintaining physical plausibility and long-range coherence. We further introduce OWM, a benchmark for open-set motion prediction based on diverse in-the-wild videos, to evaluate accuracy and variability of predicted trajectory distributions under real-world uncertainty. Our method matches or surpasses dense simulators in predictive accuracy while achieving orders-of-magnitude higher sampling speed, making open-set future prediction both scalable and practical. Project page: http://compvis.github.io/myriad.
Problem

Research questions and friction points this paper is trying to address.

future prediction
sparse trajectories
open-set motion
long-horizon dynamics
uncertainty modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

sparse point trajectories
autoregressive diffusion model
open-set motion prediction
uncertainty modeling
future scene dynamics