CD-NGP: A Fast Scalable Continual Representation for Dynamic Scenes

📅 2024-09-08
🏛️ arXiv.org
📈 Citations: 2
Influential: 0
🤖 AI Summary
Novel view synthesis (NVS) for dynamic scenes suffers from high memory overhead, poor model scalability, inefficient training, and rendering artifacts. To address these challenges, this work proposes the first continual learning framework tailored for dynamic scenes, introducing dynamic neural graphics primitives and a spatiotemporal-decoupled hash encoding fusion mechanism to enable parameter reuse and implicit representation optimization. Furthermore, we establish the first benchmark dataset featuring ultra-long multi-view videos with complex rigid and non-rigid motions. Our method reduces peak training memory by 85% (<14 GB) and compresses streaming bandwidth to <0.4 MB/frame. Quantitative and qualitative evaluations demonstrate substantial improvements in reconstruction fidelity and scalability over state-of-the-art methods.

📝 Abstract
Current methods for novel view synthesis (NVS) in dynamic scenes encounter significant challenges in managing memory consumption, model complexity, training efficiency, and rendering fidelity. Existing offline techniques, while delivering high-quality results, face substantial memory demands and limited scalability. Conversely, online methods struggle to balance rapid convergence with model compactness. To address these issues, we propose continual dynamic neural graphics primitives (CD-NGP). Our approach leverages a continual learning framework to reduce memory overhead, and it integrates features from distinct temporal and spatial hash encodings for high rendering quality. Meanwhile, our method employs parameter reuse to achieve high scalability. Additionally, we introduce a novel dataset featuring multi-view, exceptionally long video sequences with substantial rigid and non-rigid motion, a property seldom found in existing datasets. We evaluate the reconstruction quality, speed, and scalability of our method on both established public datasets and our exceptionally long video dataset. Notably, our method achieves an 85% reduction in training memory consumption (less than 14 GB) compared to offline techniques and significantly lowers streaming bandwidth requirements (less than 0.4 MB/frame) relative to other online alternatives. Experiments on our long-video dataset show superior scalability and reconstruction quality compared to existing state-of-the-art approaches.
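The abstract's "distinct temporal and spatial hash encodings" can be illustrated with a minimal sketch: a 3D spatial hash grid and a separate 1D temporal hash grid whose looked-up features are concatenated before being fed to a decoder. All names, table sizes, and the nearest-vertex lookup below are illustrative simplifications and assumptions, not the paper's actual multiresolution, interpolated encoder.

```python
import numpy as np

# Hash primes in the style of Instant-NGP-like encoders (assumed here).
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

class HashGrid:
    """Single-level hash grid with nearest-vertex lookup (a simplification)."""

    def __init__(self, dims, resolution, table_size, feat_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.dims = dims
        self.res = resolution
        self.table_size = table_size
        # Learnable feature table; random init stands in for trained weights.
        self.table = rng.normal(0.0, 1e-2, size=(table_size, feat_dim))

    def __call__(self, x):
        # x: (N, dims) coordinates normalized to [0, 1].
        idx = np.minimum((x * self.res).astype(np.uint64), self.res - 1)
        h = np.zeros(len(x), dtype=np.uint64)
        for d in range(self.dims):
            h ^= idx[:, d] * PRIMES[d]  # XOR-combine per-axis hashes
        return self.table[h % self.table_size]

# Decoupled encoders: one over 3D space, one over 1D time.
spatial = HashGrid(dims=3, resolution=128, table_size=2**14, feat_dim=4)
temporal = HashGrid(dims=1, resolution=64, table_size=2**10, feat_dim=4, seed=1)

pts = np.random.rand(5, 3)   # sampled 3D points
t = np.random.rand(5, 1)     # normalized timestamps
# Fuse by concatenation; the fused feature would feed a small MLP decoder.
feats = np.concatenate([spatial(pts), temporal(t)], axis=1)
```

Decoupling the two encodings keeps the temporal table small and independent of spatial resolution, which is one way such a design can hold memory growth down as sequences lengthen.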
Problem

Research questions and friction points this paper is trying to address.

Reducing memory consumption in dynamic scene synthesis
Improving rendering quality with spatial-temporal encoding
Enhancing scalability through continual learning framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter reuse reduces memory and enhances scalability
Spatial and temporal hash encodings improve rendering quality
New dataset with long-duration multi-view videos
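The parameter-reuse point above can be sketched structurally: a shared base model is trained on the first segment and then frozen, and each later segment trains only a small per-segment branch, so streaming a new segment costs only that branch. All sizes and names below are invented for illustration; this is not the authors' implementation.

```python
# Shared base representation: trained on segment 0, then frozen and reused.
base = {"hash_table": [0.0] * 100_000}
branches = []

def train_segment(frames):
    # Hypothetical per-segment step: reuses `base`, learns only a small
    # branch table. `frames` is a placeholder; no real training happens here.
    branch = {"hash_table": [0.0] * 2_000}
    branches.append(branch)
    return branch

# Continual training over a stream of video segments.
for segment in range(4):
    train_segment(frames=None)

# Only the small branch must be transmitted per new segment.
per_segment_floats = len(branches[-1]["hash_table"])
```

Under this scheme, per-segment transmission scales with the branch size rather than the full model, which is the mechanism behind the low streaming bandwidth the summary reports.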
Authors
Zhenhuan Liu (NVIDIA)
Shuai Liu
Zhiwei Ning
Jie Yang
Wei Liu (Shanghai Jiao Tong University)