V-DPM: 4D Video Reconstruction with Dynamic Point Maps

📅 2026-01-14

📈 Citations: 1

✨ Influential: 1

🤖 AI Summary

Existing dynamic 3D reconstruction methods struggle to process multi-view video efficiently, struggle to recover the motion of every point in the scene, and often depend on post-hoc optimization. V-DPM addresses these limitations by extending Dynamic Point Maps (DPMs) from image pairs to video input. It builds on the VGGT architecture: although VGGT was trained on static scenes, a modest amount of synthetic data is enough to adapt it into an end-to-end 4D predictor that reuses the pretrained model. The method directly predicts dynamic depth and the full 3D motion of every scene point, without optimization-based post-processing across views. The authors report state-of-the-art performance in 3D and 4D reconstruction for dynamic scenes and note that, unlike recent dynamic extensions of VGGT such as P3, V-DPM recovers full 3D motion in addition to dynamic depth.

📝 Abstract

Powerful 3D representations such as DUSt3R invariant point maps, which encode 3D shape and camera parameters, have significantly advanced feed-forward 3D reconstruction. While point maps assume static scenes, Dynamic Point Maps (DPMs) extend this concept to dynamic 3D content by additionally representing scene motion. However, existing DPMs are limited to image pairs and, like DUSt3R, require post-processing via optimization when more than two views are involved. We argue that DPMs are more useful when applied to videos and introduce V-DPM to demonstrate this. First, we show how to formulate DPMs for video input in a way that maximizes representational power, facilitates neural prediction, and enables reuse of pretrained models. Second, we implement these ideas on top of VGGT, a recent and powerful 3D reconstructor. Although VGGT was trained on static scenes, we show that a modest amount of synthetic data is sufficient to adapt it into an effective V-DPM predictor. Our approach achieves state-of-the-art performance in 3D and 4D reconstruction for dynamic scenes. In particular, unlike recent dynamic extensions of VGGT such as P3, DPMs recover not only dynamic depth but also the full 3D motion of every point in the scene.
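
As a rough illustration of what "representing scene motion" can mean for a point map, the sketch below assumes a simplified convention: the pixels of one image are mapped to 3D points at several time indices, all expressed in a single reference camera frame. Under that assumption, the displacement between two time indices gives per-pixel 3D motion, and reading one pixel across all time indices gives its 3D trajectory. The names `PointMap`, `scene_flow`, and `trajectory` are hypothetical and are not taken from the paper or from any V-DPM or VGGT code.

```python
from typing import Dict, List, Tuple

# Hypothetical types for illustration only (not the authors' data structures).
Point = Tuple[float, float, float]
PointMap = List[List[Point]]  # H x W grid of 3D points in one reference frame


def scene_flow(pm_src: PointMap, pm_dst: PointMap) -> PointMap:
    """Per-pixel 3D displacement between two time-indexed point maps of the same image."""
    return [
        [tuple(b - a for a, b in zip(p, q)) for p, q in zip(row_src, row_dst)]
        for row_src, row_dst in zip(pm_src, pm_dst)
    ]


def trajectory(dpm: Dict[int, PointMap], row: int, col: int) -> List[Point]:
    """3D track of one pixel: its point read out at every time index, in order."""
    return [dpm[t][row][col] for t in sorted(dpm)]


# Toy 1x2 image: pixel (0, 0) is static, pixel (0, 1) moves +0.5 along x per step.
dpm = {
    0: [[(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]],
    1: [[(0.0, 0.0, 2.0), (1.5, 0.0, 2.0)]],
    2: [[(0.0, 0.0, 2.0), (2.0, 0.0, 2.0)]],
}

flow = scene_flow(dpm[0], dpm[1])  # motion from time 0 to time 1
track = trajectory(dpm, 0, 1)      # 3D path of the moving pixel
print(flow)
print(track)
```

In this toy setup a static scene would yield identical point maps at every time index, so the flow would be zero everywhere; a dynamic scene differs only at the pixels that belong to moving content.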

Problem

Research questions and friction points this paper is trying to address.

Dynamic Point Maps

4D reconstruction

video reconstruction

dynamic scenes

3D motion

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Point Maps

4D Reconstruction

Video-based 3D Reconstruction