Dynamic View Synthesis from Small Camera Motion Videos

📅 2025-06-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing methods for novel-view synthesis of dynamic 3D scenes from small-camera-motion videos suffer from inaccurate geometric representation and biased camera parameter estimation. To address these limitations, we propose a joint optimization framework integrating Distribution-based Depth Regularization (DDR) and camera pose refinement within a NeRF-based pipeline. Specifically, we introduce Gumbel-softmax differentiable sampling to model the distribution of rendering weights, impose boundary-aware density constraints to enhance depth continuity, and develop an interactive visualization tool to guide geometric learning. Crucially, camera poses are treated as learnable parameters optimized end-to-end during training. Evaluated on multiple dynamic scene datasets, our method achieves significant improvements in reconstruction accuracy and novel-view synthesis quality, particularly under minimal camera motion, demonstrating superior geometric consistency and robustness. This work provides an effective solution for dynamic novel-view synthesis from low-motion video sequences.

📝 Abstract
Novel view synthesis for dynamic 3D scenes poses a significant challenge. Many notable efforts use NeRF-based approaches to address this task and yield impressive results. However, these methods rely heavily on sufficient motion parallax in the input images or videos. When the camera motion range becomes limited or even stationary (i.e., small camera motion), existing methods encounter two primary challenges: incorrect representation of scene geometry and inaccurate estimation of camera parameters. These challenges cause prior methods to produce unsatisfactory results or fail entirely. To address the first challenge, we propose a novel Distribution-based Depth Regularization (DDR) that ensures the rendering weight distribution aligns with the true distribution. Specifically, unlike previous methods that use a depth loss to compute the error of the expectation, we compute the expectation of the error by using Gumbel-softmax to differentiably sample points from the discrete rendering weight distribution. Additionally, we introduce constraints that enforce the volume density of spatial points before the object boundary along the ray to be near zero, ensuring that our model learns the correct geometry of the scene. To demystify the DDR, we further propose a visualization tool that enables observing the scene geometry representation at the rendering weight level. For the second challenge, we incorporate camera parameter learning during training to enhance the robustness of our model to camera parameters. We conduct extensive experiments to demonstrate the effectiveness of our approach in representing scenes with small camera motion input, and our results compare favorably to state-of-the-art methods.
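The abstract's key distinction, computing the expectation of the depth error via differentiable samples from the discrete rendering weight distribution rather than the error of the expected depth, can be sketched in NumPy. This is an illustrative sketch only, not the authors' implementation: the function names, the temperature value, and the use of plain NumPy are assumptions (a real pipeline would run inside an autodiff framework so gradients flow through the soft samples).

```python
import numpy as np

def gumbel_softmax_sample(weights, tau=0.1, rng=None):
    """Draw one soft sample from a discrete distribution via Gumbel-softmax.

    weights: rendering weights along a ray (non-negative, summing to ~1).
    Returns a soft one-hot vector; as tau -> 0 it approaches a hard sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    logits = np.log(weights + 1e-10)
    # Standard Gumbel noise via inverse transform sampling
    u = rng.uniform(low=1e-10, high=1.0 - 1e-10, size=weights.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / tau
    y -= y.max()  # numerical stability before exponentiation
    p = np.exp(y)
    return p / p.sum()

def ddr_depth_error(weights, depths, gt_depth, n_samples=64, tau=0.1, rng=None):
    """Expectation of the error (not error of the expectation):
    average |sampled depth - ground-truth depth| over Gumbel-softmax samples."""
    rng = np.random.default_rng(0) if rng is None else rng
    errs = []
    for _ in range(n_samples):
        soft = gumbel_softmax_sample(weights, tau, rng)
        errs.append(abs(float(soft @ depths) - gt_depth))
    return float(np.mean(errs))
```

The point of sampling: a diffuse weight distribution is penalized even when its expected depth happens to coincide with the ground truth, which is exactly the failure mode of expectation-based depth losses that the abstract describes.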
Problem

Research questions and friction points this paper is trying to address.

Incorrect scene geometry representation under small camera motion
Inaccurate camera parameter estimation under a limited motion range
Challenges in novel view synthesis for dynamic 3D scenes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Distribution-based Depth Regularization for geometry accuracy
Gumbel-softmax sampling for differentiable rendering weights
Joint camera parameter learning during model training
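The third innovation, treating camera parameters as learnable and optimizing them jointly with the scene, can be illustrated at toy scale. Everything below is invented for illustration (a 1-D "camera" with a single offset, an analytic mean-squared-error gradient); the paper refines full NeRF camera poses through the rendering loss.

```python
import numpy as np

def render(scene_depth, cam_offset, xs):
    # Hypothetical 1-D "projection": predicted depth shifts linearly with
    # the camera offset. Stands in for differentiable volume rendering.
    return scene_depth + cam_offset * xs

def fit(xs, observed, lr=0.05, steps=500):
    # Both the scene parameter and the camera parameter are learnable,
    # mirroring end-to-end pose refinement at toy scale.
    scene_depth, cam_offset = 0.0, 0.0
    for _ in range(steps):
        resid = render(scene_depth, cam_offset, xs) - observed
        # Analytic gradients of the mean squared error w.r.t. each parameter
        scene_depth -= lr * 2.0 * resid.mean()
        cam_offset -= lr * 2.0 * (resid * xs).mean()
    return scene_depth, cam_offset
```

Because the camera parameter receives gradients from the same loss as the scene, errors in the initial pose estimate (a dominant failure mode under small camera motion, where structure-from-motion is unreliable) can be corrected during training rather than baked into the geometry.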
👥 Authors
Huiqiang Sun
Key Laboratory of Image Processing and Intelligent Control, Ministry of Education; School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
Xingyi Li
Key Laboratory of Image Processing and Intelligent Control, Ministry of Education; School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China
Juewen Peng
Nanyang Technological University
deep learning
Liao Shen
Huazhong University of Science and Technology
computer vision
Zhiguo Cao
Huazhong University of Science and Technology
Pattern Recognition, Computer Vision
Ke Xian
Huazhong University of Science and Technology
2D/3D Perception, Neural Generation/Rendering/Editing
Guosheng Lin
Nanyang Technological University
Computer Vision, Machine Learning