Generating the Past, Present and Future from a Motion-Blurred Image

📅 2025-12-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
This work addresses reconstructing temporal scene dynamics from a single motion-blurred image. We propose the first cross-temporal (past/present/future) joint generation framework leveraging an internet-scale pre-trained video diffusion model. Our method eschews handcrafted priors, instead implicitly modeling blur kernels and explicitly enforcing spatiotemporal consistency constraints to transfer video diffusion priors into the blur domain, enabling multi-frame-consistent restoration without paired training data. Unlike conventional deblurring or unidirectional prediction approaches, ours is the first to directly synthesize physically plausible, temporally coherent video sequences spanning multiple time instants from a single blurred input. Extensive evaluation on real-world complex motion-blurred images demonstrates significant improvements over state-of-the-art methods. Moreover, the reconstructed sequences effectively support downstream tasks including camera trajectory estimation, object motion analysis, and dynamic 3D reconstruction.
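For intuition, the physics behind such consistency constraints is that a motion-blurred image is approximately the temporal average of the sharp frames exposed during capture. Below is a minimal sketch of that blur-formation model and a corresponding consistency penalty; it is illustrative only (PyTorch, with an assumed frame count and an L1 penalty), not the authors' implementation.

```python
import torch

def synthesize_blur(frames: torch.Tensor) -> torch.Tensor:
    """Average N sharp frames (N, C, H, W) into one motion-blurred image (C, H, W)."""
    return frames.mean(dim=0)

def blur_consistency_loss(pred_frames: torch.Tensor,
                          blurred: torch.Tensor) -> torch.Tensor:
    """Penalize generated exposure-window frames whose temporal average
    deviates from the observed blurred input."""
    return torch.nn.functional.l1_loss(synthesize_blur(pred_frames), blurred)

# Example: 9 "sharp" frames -> one blurred image; the loss against the
# frames that produced it is (numerically) zero.
frames = torch.rand(9, 3, 64, 64)
blurred = synthesize_blur(frames)
assert blur_consistency_loss(frames, blurred).item() < 1e-6
```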

๐Ÿ“ Abstract
We seek to answer the question: what can a motion-blurred image reveal about a scene's past, present, and future? Although motion blur obscures image details and degrades visual quality, it also encodes information about scene and camera motion during an exposure. Previous techniques leverage this information to estimate a sharp image from a blurry input, or to predict a sequence of video frames showing what might have occurred at the moment of image capture. However, they rely on handcrafted priors or network architectures to resolve ambiguities in this inverse problem, and do not incorporate image and video priors learned from large-scale datasets. As such, existing methods struggle to reproduce complex scene dynamics and do not attempt to recover what occurred before or after an image was taken. Here, we introduce a new technique that repurposes a pre-trained video diffusion model trained on internet-scale datasets to recover videos revealing complex scene dynamics during the moment of capture and what might have occurred immediately into the past or future. Our approach is robust and versatile; it outperforms previous methods for this task, generalizes to challenging in-the-wild images, and supports downstream tasks such as recovering camera trajectories, object motion, and dynamic 3D scene structure. Code and data are available at https://blur2vid.github.io
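As a rough illustration of how a pre-trained diffusion model can be repurposed for an inverse problem like this, the sketch below guides each denoising step toward measurement consistency, in the spirit of diffusion posterior sampling. The `denoiser` callable, the linear noise schedule, and the `guidance` scale are hypothetical stand-ins; the paper's actual conditioning and sampling procedure may differ.

```python
import torch

def guided_sample(denoiser, blurred, num_steps=50, guidance=1.0,
                  video_shape=(8, 3, 64, 64)):
    """Schematic reverse-diffusion loop with a blur-consistency correction."""
    x = torch.randn(video_shape)                 # start from pure noise
    for step in reversed(range(1, num_steps + 1)):
        t = step / num_steps                     # toy linear noise level
        x = x.detach().requires_grad_(True)
        x0_hat = denoiser(x, t)                  # model's clean-video estimate
        # Data-fidelity term: the temporal mean of the predicted frames
        # should reproduce the observed blurred image.
        fidelity = torch.nn.functional.mse_loss(x0_hat.mean(dim=0), blurred)
        grad, = torch.autograd.grad(fidelity, x)
        with torch.no_grad():
            t_prev = (step - 1) / num_steps
            # Re-noise to the previous level, then nudge toward consistency.
            x = x0_hat + t_prev * torch.randn_like(x0_hat) - guidance * grad
    return x.detach()

# Toy usage with an identity "denoiser" standing in for a real video model.
video = guided_sample(lambda x, t: x, blurred=torch.rand(3, 64, 64))
```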
Problem

Research questions and friction points this paper is trying to address.

Recovering complex scene dynamics from motion-blurred images
Predicting past and future events beyond the captured moment
Leveraging large-scale video priors for robust inverse problem solving
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a pre-trained video diffusion model
Recovers past, present, and future scene dynamics
Generalizes to challenging real-world images
SaiKiran Tedla
York University, Canada
Kelly Zhu
University of Toronto, Canada and Vector Institute, Canada
Trevor Canham
York University, Canada
Felix Taubner
University of Toronto, Canada and Vector Institute, Canada
Michael S. Brown
Samsung AI Center (Toronto) and York University, Canada
Kiriakos N. Kutulakos
University of Toronto, Canada and Vector Institute, Canada
David B. Lindell
University of Toronto, Canada and Vector Institute, Canada