RiGS: Rigid-aware 4D Gaussian Splatting from a Single Monocular Video

📅 2026-05-22
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses 4D reconstruction from monocular videos whose dynamics span multiple scales, specifically long-term rigid motions and short-term complex deformations. It proposes RiGS, a method that partitions Gaussian primitives into static, rigid, and transient categories according to their motion characteristics. RiGS lets rigid Gaussians transition into transient ones, and it uses object-level dynamic masks together with scene flow guidance to separate static from dynamic regions while integrating long-range spatiotemporal information. Built on the 4D Gaussian Splatting framework, RiGS is reported to produce high-quality, temporally consistent, detail-rich reconstructions of dynamic scenes and to reach state-of-the-art performance on novel view synthesis benchmarks.
📝 Abstract
Reconstructing dynamic 3D scenes from monocular videos is a fundamental yet highly challenging task, as real-world motions often involve both long-term smooth transformations and short-term complex deformations. Existing methods either struggle to maintain temporal consistency or fail to capture high-frequency dynamics due to limited motion modeling capacity. In this work, we present Rigid-aware 4D Gaussian Splatting (RiGS), which simultaneously captures motions across multiple temporal scales. Specifically, RiGS introduces three types of Gaussian primitives: static, rigid, and transient, which represent static backgrounds, long-term low-frequency motions, and short-term high-frequency dynamics, respectively. An object-wise dynamic mask is proposed to aggregate long-range spatiotemporal motion information and guide the decomposition of static and dynamic regions. To jointly model motion across scales, rigid Gaussians are allowed to transition into transient Gaussians based on their temporal duration, and both are optimized under scene flow guidance, providing dense 3D motion supervision. Extensive experiments demonstrate that RiGS achieves state-of-the-art performance on novel view synthesis benchmarks. Code is available at https://github.com/ladvu/RiGS.
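The three-way split described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function name `classify_gaussians`, the per-Gaussian `durations` input, the rule that a short duration means transient, and the threshold `min_rigid_duration` are all hypothetical, since the abstract does not specify the actual criteria.

```python
import numpy as np

# Category labels (names follow the abstract; integer codes are arbitrary).
STATIC, RIGID, TRANSIENT = 0, 1, 2


def classify_gaussians(dynamic_mask, durations, min_rigid_duration=0.5):
    """Assign each Gaussian a category label.

    dynamic_mask: boolean per-Gaussian flag, True if the Gaussian lies in a
        region marked dynamic (hypothetical stand-in for the object-wise mask).
    durations: per-Gaussian temporal extent as a fraction of the video
        length (hypothetical representation).
    min_rigid_duration: hypothetical threshold; dynamic Gaussians that last
        less than this are treated as transient.
    """
    dynamic_mask = np.asarray(dynamic_mask, dtype=bool)
    durations = np.asarray(durations, dtype=float)

    # Everything starts as static background.
    labels = np.full(dynamic_mask.shape, STATIC, dtype=np.int64)
    # Gaussians in dynamic regions are rigid by default.
    labels[dynamic_mask] = RIGID
    # Short-lived dynamic Gaussians transition from rigid to transient.
    labels[dynamic_mask & (durations < min_rigid_duration)] = TRANSIENT
    return labels


labels = classify_gaussians(
    dynamic_mask=[False, True, True],
    durations=[1.0, 0.9, 0.1],
)
print(labels.tolist())
```

In this sketch the first Gaussian stays static, the second is a long-lived dynamic Gaussian and is labeled rigid, and the third is short-lived and becomes transient. The real method also optimizes rigid and transient Gaussians under scene flow guidance, which this sketch does not model.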
Problem

Research questions and friction points this paper is trying to address.

dynamic 3D reconstruction
monocular video
temporal consistency
high-frequency dynamics
motion modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

4D Gaussian Splatting
rigid-aware motion modeling
dynamic scene reconstruction
scene flow guidance
monocular video