Mem4D: Decoupling Static and Dynamic Memory for Dynamic Scene Reconstruction

📅 2025-08-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Monocular dense geometric reconstruction of dynamic scenes faces a “memory demand dilemma”: static structures require long-term memory stability, whereas dynamic objects require rapid updates and high-frequency detail preservation. Existing methods struggle to balance both, which leads to geometric drift in static structures or blurred reconstructions of dynamic objects. This paper introduces Mem4D, a framework built on a dual-memory decoupling architecture: a Transient Dynamics Memory dedicated to capturing instantaneous motion details, and a Persistent Structure Memory that ensures long-term geometric consistency for static elements. Coupled with a neural rendering mechanism that employs temporal feature storage and selective querying, Mem4D lets the two memories work together. Evaluated on multiple benchmarks, Mem4D achieves state-of-the-art or competitive performance, mitigating drift and blur while maintaining efficient inference.

📝 Abstract
Reconstructing dense geometry for dynamic scenes from a monocular video is a critical yet challenging task. Recent memory-based methods enable efficient online reconstruction, but they fundamentally suffer from a Memory Demand Dilemma: The memory representation faces an inherent conflict between the long-term stability required for static structures and the rapid, high-fidelity detail retention needed for dynamic motion. This conflict forces existing methods into a compromise, leading to either geometric drift in static structures or blurred, inaccurate reconstructions of dynamic objects. To address this dilemma, we propose Mem4D, a novel framework that decouples the modeling of static geometry and dynamic motion. Guided by this insight, we design a dual-memory architecture: 1) The Transient Dynamics Memory (TDM) focuses on capturing high-frequency motion details from recent frames, enabling accurate and fine-grained modeling of dynamic content; 2) The Persistent Structure Memory (PSM) compresses and preserves long-term spatial information, ensuring global consistency and drift-free reconstruction for static elements. By alternating queries to these specialized memories, Mem4D simultaneously maintains static geometry with global consistency and reconstructs dynamic elements with high fidelity. Experiments on challenging benchmarks demonstrate that our method achieves state-of-the-art or competitive performance while maintaining high efficiency. Codes will be publicly available.
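The decoupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation, and the abstract states the code is not yet released. Every name here (`TransientDynamicsMemory`, `PersistentStructureMemory`, `run_stream`, `window`) is hypothetical, the sliding window and running mean are stand-ins chosen only to show the contrast between a short-lived, detail-preserving store and a compressed long-term store, and frame features are plain lists of floats.

```python
from collections import deque


class TransientDynamicsMemory:
    """Hypothetical short-term store: keeps only the most recent frame features."""

    def __init__(self, window=4):
        # A bounded deque drops the oldest entry automatically when full.
        self.frames = deque(maxlen=window)

    def write(self, feat):
        self.frames.append(list(feat))

    def read(self):
        return [list(f) for f in self.frames]


class PersistentStructureMemory:
    """Hypothetical long-term store: compresses every frame into a running mean."""

    def __init__(self):
        self.summary = None
        self.count = 0

    def write(self, feat):
        self.count += 1
        if self.summary is None:
            self.summary = list(feat)
        else:
            # Incremental mean update: old + (new - old) / n.
            self.summary = [s + (x - s) / self.count
                            for s, x in zip(self.summary, feat)]

    def read(self):
        return list(self.summary) if self.summary is not None else []


def run_stream(frames, window=4):
    """Query both memories for each incoming frame, then write the frame to both."""
    tdm = TransientDynamicsMemory(window)
    psm = PersistentStructureMemory()
    outputs = []
    for feat in frames:
        recent = tdm.read()      # fine-grained recent context (dynamic content)
        structure = psm.read()   # compressed long-term context (static content)
        outputs.append((recent, structure))
        tdm.write(feat)
        psm.write(feat)
    return tdm, psm, outputs


if __name__ == "__main__":
    frames = [[float(t), 1.0] for t in range(6)]
    tdm, psm, _ = run_stream(frames, window=4)
    print(tdm.read())  # only the last `window` frames survive
    print(psm.read())  # one fixed-size summary of all frames
```

In this toy version the transient store forgets everything older than `window` frames, while the persistent store has constant size regardless of sequence length. How Mem4D actually compresses, updates, and queries its memories is not specified in the text above.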
Problem

Research questions and friction points this paper is trying to address.

Decoupling static and dynamic memory for scene reconstruction
Resolving memory conflict in dynamic scene modeling
Enhancing fidelity in static and dynamic reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Decouples static and dynamic memory modeling
Uses Transient Dynamics Memory for motion details
Employs Persistent Structure Memory for static consistency
Xudong Cai
Renmin University of China
computer vision, camera localization, SLAM
Shuo Wang
Renmin University of China
Peng Wang
Renmin University of China
Yongcai Wang
Renmin University of China
Zhaoxin Fan
Beihang University
Wanting Li
Renmin University of China
Tianbao Zhang
Shanghai Jiao Tong University
Jianrong Tao
Zhejiang University
Yeying Jin
Tencent | National University of Singapore
Computer Vision, AIGC, GenAI, MLLM, VLM
Deying Li
Renmin University of China