LocalDyGS: Multi-view Global Dynamic Scene Modeling via Adaptive Local Implicit Feature Decoupling

📅 2025-07-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing dynamic neural rendering methods (e.g., NeRF, 3D Gaussian Splatting) struggle to jointly model large-scale motion and fine-grained dynamic detail, leading to geometric distortion, temporal instability, and visual artifacts in synthesized videos of real-world scenes. To address this, we propose an adaptive local implicit feature decoupling framework: learnable spatiotemporal seeds partition the scene into streamlined local spaces; within each space, a static feature shared across time steps is explicitly decoupled from a time-specific dynamic residual field; and the combined features are decoded into Temporal Gaussians under static-dynamic co-optimization. The method delivers competitive performance against state-of-the-art approaches on fine-scale dynamic benchmarks and represents the first attempt to model larger, more complex highly dynamic real-world scenes, improving realism, geometric consistency, and temporal stability in dynamic video synthesis.
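To make the seed-based decomposition concrete, the NumPy sketch below shows one plausible reading of the idea: seeds are initialized from the scene point cloud (voxel downsampling is a common choice, though the paper's actual initialization may differ), and each point is assigned to its nearest seed to define a local space. The function names, voxel size, and nearest-seed assignment rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (assumed, not the authors' code): partition a scene
# point cloud into local spaces by nearest-seed assignment.
import numpy as np

def init_seeds(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Initialize seeds by voxel-downsampling the point cloud."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, idx = np.unique(keys, axis=0, return_index=True)
    return points[idx]

def assign_local_spaces(points: np.ndarray, seeds: np.ndarray) -> np.ndarray:
    """Label each point with the index of its nearest seed; each label
    defines membership in one local space."""
    d2 = ((points[:, None, :] - seeds[None, :, :]) ** 2).sum(-1)  # (N, S)
    return d2.argmin(axis=1)

pts = np.random.rand(10_000, 3)          # stand-in for an SfM point cloud
seeds = init_seeds(pts, voxel_size=0.25)
labels = assign_local_spaces(pts, seeds)
print(seeds.shape, labels.shape)         # e.g. (64, 3) (10000,)
```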

📝 Abstract
Due to the complex and highly dynamic motions in the real world, synthesizing dynamic videos from multi-view inputs for arbitrary viewpoints is challenging. Previous works based on neural radiance fields or 3D Gaussian splatting are limited to modeling fine-scale motion, greatly restricting their application. In this paper, we introduce LocalDyGS, which consists of two parts to adapt our method to both large-scale and fine-scale motion scenes: 1) We decompose a complex dynamic scene into streamlined local spaces defined by seeds, enabling global modeling by capturing motion within each local space. 2) We decouple static and dynamic features for local space motion modeling. A static feature shared across time steps captures static information, while a dynamic residual field provides time-specific features. These are combined and decoded to generate Temporal Gaussians, modeling motion within each local space. As a result, we propose a novel dynamic scene reconstruction framework to model highly dynamic real-world scenes more realistically. Our method not only demonstrates competitive performance on various fine-scale datasets compared to state-of-the-art (SOTA) methods, but also represents the first attempt to model larger and more complex highly dynamic scenes. Project page: https://wujh2001.github.io/LocalDyGS/.
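As a rough illustration of part 2) of the abstract, the hedged PyTorch sketch below combines a per-seed static feature (shared across time steps) with a time-specific dynamic residual and decodes the sum into Temporal Gaussian parameters. The module name, feature sizes, MLP layout, and the 14-dimensional output (offset, scale, rotation, opacity, color) are assumptions chosen for illustration, not the paper's actual architecture.

```python
# Illustrative sketch of static/dynamic feature decoupling; all sizes
# and layer choices are assumptions, not the authors' architecture.
import torch
import torch.nn as nn

class TemporalGaussianHead(nn.Module):
    def __init__(self, num_seeds: int, feat_dim: int = 32):
        super().__init__()
        # Static feature shared across all time steps (one per seed).
        self.static_feat = nn.Parameter(torch.randn(num_seeds, feat_dim))
        # Dynamic residual field: conditions the feature on the time step t.
        self.residual = nn.Sequential(
            nn.Linear(feat_dim + 1, 64), nn.ReLU(), nn.Linear(64, feat_dim)
        )
        # Assumed decoding: offset(3) + scale(3) + rotation(4) + opacity(1) + color(3).
        self.decode = nn.Linear(feat_dim, 14)

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        f_static = self.static_feat                                 # (S, F)
        t_in = t.expand(f_static.shape[0], 1)                       # (S, 1)
        f_dynamic = self.residual(torch.cat([f_static, t_in], -1))  # (S, F)
        return self.decode(f_static + f_dynamic)                    # (S, 14)

head = TemporalGaussianHead(num_seeds=100)
params = head(torch.tensor([[0.25]]))  # query at a normalized time step
print(params.shape)                    # torch.Size([100, 14])
```

In a full pipeline these per-seed parameters would be handed to a differentiable Gaussian rasterizer; the sketch stops at the decoded parameters to stay self-contained.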
Problem

Research questions and friction points this paper is trying to address.

Synthesizing dynamic videos from multi-view inputs for arbitrary viewpoints
Modeling both large-scale and fine-scale motion in dynamic scenes
Decoupling static and dynamic features for accurate motion modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decomposes a complex dynamic scene into streamlined, seed-defined local spaces
Adaptively decouples static and dynamic features within each local space
Generates Temporal Gaussians for motion modeling (a toy co-optimization sketch follows this list)
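As referenced above, here is a toy, self-contained sketch of what static-dynamic co-optimization could look like: per-seed static features and a time-conditioned residual MLP are updated jointly from a photometric loss. The "rendering" is a crude color average so the example runs without a rasterizer; a real pipeline would splat the decoded Temporal Gaussians with a differentiable 3DGS renderer.

```python
# Toy co-optimization sketch; every tensor shape and the fake "render"
# step are assumptions made so the loop runs stand-alone.
import torch

static_feat = torch.randn(100, 32, requires_grad=True)  # per-seed static features
residual = torch.nn.Sequential(                         # time-conditioned residual MLP
    torch.nn.Linear(33, 64), torch.nn.ReLU(), torch.nn.Linear(64, 32)
)
decode = torch.nn.Linear(32, 3)  # decode straight to a color for this toy

params = [static_feat, *residual.parameters(), *decode.parameters()]
opt = torch.optim.Adam(params, lr=1e-3)
target = torch.tensor([0.2, 0.5, 0.8])  # dummy ground-truth pixel color

for step in range(200):
    t = torch.rand(1, 1).expand(100, 1)           # one random time step per iteration
    feat = static_feat + residual(torch.cat([static_feat, t], dim=-1))
    color = torch.sigmoid(decode(feat)).mean(0)   # crude stand-in for rendering
    loss = ((color - target) ** 2).mean()         # photometric L2 loss
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"final loss: {loss.item():.4f}")
```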
Jiahao Wu
The Chinese University of Hong Kong
Medical Robots · Robot-assisted Microsurgery · Motion Planning
Rui Peng
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University; Pengcheng Lab
Jianbo Jiao
University of Birmingham | University of Oxford
Computer Vision · Machine Learning
Jiayu Yang
The Australian National University
3D Computer Vision · 3D AIGC · 3D Reconstruction · Multi-view Stereo · VR/AR/XR
Luyang Tang
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University; Pengcheng Lab
Kaiqiang Xiong
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University; Pengcheng Lab
Jie Liang
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
Jinbo Yan
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
Runling Liu
Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
Ronggang Wang
Shenzhen Graduate School, Peking University
Immersive Video Coding and Processing