AI Summary
To address the ill-posedness of monocular dynamic smoke 3D reconstruction caused by severe viewpoint ambiguity, this paper proposes a physics-guided diffusion-model framework. Methodologically, it integrates diffusion priors with differentiable Navier-Stokes modeling, incorporating divergence and curl constraints on the velocity field, explicit advection optimization, and progressive multi-view rendering to jointly optimize the density field, velocity field, and smoke-source location from a single input video. Its key innovations are: (i) the first incorporation of physical constraints directly into the diffusion generation process, enabling end-to-end co-optimization of generative priors and fluid dynamics; and (ii) the alleviation of monocular information deficiency via differentiable view synthesis and iterative density refinement. On standard benchmarks, the method surpasses state-of-the-art approaches in density-field PSNR, velocity-field angular error, and smoke-source localization accuracy; qualitative results further demonstrate superior spatiotemporal coherence and realism.
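The divergence and curl constraints on the velocity field mentioned above can be sketched as finite-difference penalty terms on a discretized 2D velocity field. This is a minimal illustration under assumed grid conventions, not the paper's implementation; the function names, weights, and the optional vorticity-guidance target are placeholders:

```python
import numpy as np

def divergence_2d(u, v, h):
    """Central-difference divergence du/dx + dv/dy on a uniform grid
    with spacing h; arrays are indexed (y, x)."""
    return np.gradient(u, h, axis=1) + np.gradient(v, h, axis=0)

def curl_2d(u, v, h):
    """Scalar curl (vorticity) dv/dx - du/dy of a 2D velocity field."""
    return np.gradient(v, h, axis=1) - np.gradient(u, h, axis=0)

def physics_penalty(u, v, h, vort_target=None, w_div=1.0, w_curl=0.1):
    """Soft physics penalty: drive the divergence to zero (incompressibility)
    and, optionally, the vorticity toward a target field used as a guidance
    signal. The weights and target are illustrative placeholders."""
    loss = w_div * np.mean(divergence_2d(u, v, h) ** 2)
    if vort_target is not None:
        loss += w_curl * np.mean((curl_2d(u, v, h) - vort_target) ** 2)
    return loss
```

A velocity field derived from a stream function (hence divergence-free) incurs a near-zero penalty, while a compressing field is penalized, which is the behavior such a guidance term relies on during generation.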
Abstract
Reconstructing dynamic fluids from sparse views is a long-standing and challenging problem, owing to the severe lack of 3D information under insufficient view coverage. While several pioneering approaches have attempted to address this issue using differentiable rendering or novel view synthesis, they are often limited by time-consuming optimization and refinement under ill-posed conditions. To tackle these challenges, we propose SmokeSVD, an efficient and effective framework that progressively generates and reconstructs dynamic smoke from a single video by integrating the powerful generative capabilities of diffusion models with physically guided consistency optimization toward realistic appearance and dynamic evolution. Specifically, we first propose a physically guided side-view synthesizer based on diffusion models, which explicitly incorporates divergence and gradient guidance of velocity fields to generate visually realistic and spatio-temporally consistent side-view images frame by frame, significantly alleviating the ill-posedness of single-view reconstruction without imposing additional constraints. Subsequently, we obtain a rough estimate of the density field from the pair of front-view input and synthesized side-view images, and further refine the blurry 2D novel-view images and the coarse-grained 3D density field through an iterative process that progressively renders and enhances images from an increasing range of novel viewing angles, producing high-quality multi-view image sequences. Finally, we reconstruct and estimate the fine-grained density field, velocity field, and smoke source via differentiable advection based on the Navier-Stokes equations. Extensive quantitative and qualitative experiments show that our approach achieves high-quality reconstruction and outperforms previous state-of-the-art techniques.
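The differentiable advection used in the final reconstruction stage can be illustrated by a first-order semi-Lagrangian update of the density field. This is a simplified 2D NumPy/SciPy sketch under assumed conventions (velocities measured in grid cells per unit time); the paper's pipeline would implement an equivalent step in a differentiable framework so that gradients flow back to the velocity field and the smoke source:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def advect_density(rho, u, v, dt):
    """One semi-Lagrangian advection step: trace each grid cell backward
    along the velocity field and sample the density at the departure point
    with bilinear interpolation. Arrays are indexed (y, x)."""
    ny, nx = rho.shape
    yy, xx = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
    # Backtrace departure points; clamp at the boundary via mode="nearest".
    x_back = xx - dt * u
    y_back = yy - dt * v
    return map_coordinates(rho, [y_back, x_back], order=1, mode="nearest")
```

For example, a density blob advected by a uniform rightward velocity shifts one cell to the right per unit time; in an optimization loop, the mismatch between the advected field and the next frame's estimated density would drive updates to the velocity field.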