FreeDriveRF: Monocular RGB Dynamic NeRF without Poses for Autonomous Driving via Point-Level Dynamic-Static Decoupling

📅 2025-05-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses the challenging problem of dynamic scene reconstruction for autonomous driving using only monocular RGB video, without explicit camera pose priors or multi-sensor inputs. The authors propose the first end-to-end dynamic NeRF framework that jointly estimates geometry, appearance, and motion without requiring ground-truth poses or auxiliary sensor data. The method introduces three key innovations: (1) a point-wise dynamic-static decoupling mechanism guided by semantic segmentation to improve separation fidelity; (2) an optical-flow-guided warped-ray consistency loss to enforce geometric and rendering coherence for moving objects; and (3) implicit camera pose optimization regularized by dynamic flow constraints. Evaluated on KITTI and Waymo Open Dataset driving sequences, the approach achieves a 23.6% PSNR gain in dynamic object reconstruction over prior methods, significantly mitigating motion blur while attaining state-of-the-art image sharpness and temporal consistency.
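The point-level decoupling described above can be pictured as routing the samples of each camera ray to one of two radiance fields, depending on whether the corresponding pixel falls on a dynamic semantic class (car, pedestrian, cyclist) in the segmentation mask. Below is a minimal PyTorch sketch of such routing; the field interfaces, tensor shapes, and hard mask-based switching are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DecoupledRadianceField(nn.Module):
    """Illustrative point-level dynamic-static routing (assumed, not the paper's code).

    static_field:  f(x)    -> (sigma, rgb)  for the time-invariant background
    dynamic_field: f(x, t) -> (sigma, rgb)  for moving objects
    """
    def __init__(self, static_field: nn.Module, dynamic_field: nn.Module):
        super().__init__()
        self.static_field = static_field
        self.dynamic_field = dynamic_field

    def forward(self, pts, t, dynamic_mask):
        # pts:          (N_rays, N_samples, 3) points sampled along the rays
        # t:            (N_rays, 1) frame timestamps
        # dynamic_mask: (N_rays,) bool, True where the 2D semantic mask marks
        #               the pixel as a dynamic class (car, pedestrian, ...)
        sigma = torch.zeros(pts.shape[:2], device=pts.device)
        rgb = torch.zeros(*pts.shape[:2], 3, device=pts.device)

        static = ~dynamic_mask
        if static.any():
            sigma[static], rgb[static] = self.static_field(pts[static])
        if dynamic_mask.any():
            t_dyn = t[dynamic_mask].unsqueeze(1).expand(-1, pts.shape[1], -1)
            sigma[dynamic_mask], rgb[dynamic_mask] = self.dynamic_field(
                pts[dynamic_mask], t_dyn)
        # sigma and rgb are then composited by standard volume rendering.
        return sigma, rgb
```

Because the split happens before volume rendering, the static geometry never has to explain moving pixels, which is the intuition behind the reduced blurring and artifacts reported in the paper.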

📝 Abstract
Dynamic scene reconstruction for autonomous driving enables vehicles to perceive and interpret complex scene changes more precisely. Dynamic Neural Radiance Fields (NeRFs) have recently shown promising capability in scene modeling. However, many existing methods rely heavily on accurate pose inputs and multi-sensor data, leading to increased system complexity. To address this, we propose FreeDriveRF, which reconstructs dynamic driving scenes using only sequential RGB images without requiring pose inputs. We innovatively decouple dynamic and static parts at the early sampling level using semantic supervision, mitigating image blurring and artifacts. To overcome the challenges posed by object motion and occlusion with a monocular camera, we introduce a warped ray-guided dynamic object rendering consistency loss, utilizing optical flow to better constrain the dynamic modeling process. Additionally, we incorporate estimated dynamic flow to constrain the pose optimization process, improving the stability and accuracy of unbounded scene reconstruction. Extensive experiments conducted on the KITTI and Waymo datasets demonstrate the superior performance of our method in dynamic scene modeling for autonomous driving.
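To make the warped ray-guided consistency loss concrete, the sketch below takes colors rendered for dynamic-object rays at frame t, follows the estimated optical flow to the corresponding pixels at frame t+1, and penalizes the photometric difference. This is an assumed formulation for illustration; the function name, tensor layout, and the simple L1 term are not taken from the paper.

```python
import torch
import torch.nn.functional as F

def warped_ray_consistency_loss(rgb_t, flow_t_to_t1, rgb_t1_full,
                                pix_coords, dyn_mask):
    """Illustrative flow-guided rendering consistency term (assumed form).

    rgb_t:         (N, 3) colors rendered for sampled rays at frame t
    flow_t_to_t1:  (N, 2) optical flow (dx, dy) at those pixels, t -> t+1
    rgb_t1_full:   (3, H, W) rendered (or observed) image at frame t+1
    pix_coords:    (N, 2) pixel coordinates (x, y) of the rays at frame t
    dyn_mask:      (N,) bool, restrict the loss to dynamic-object rays
    """
    _, H, W = rgb_t1_full.shape
    # Warp the pixel coordinates to frame t+1 with the flow and sample colors.
    target_xy = pix_coords + flow_t_to_t1                  # (N, 2)
    grid = target_xy.clone()
    grid[:, 0] = 2.0 * grid[:, 0] / (W - 1) - 1.0          # x to [-1, 1]
    grid[:, 1] = 2.0 * grid[:, 1] / (H - 1) - 1.0          # y to [-1, 1]
    sampled = F.grid_sample(rgb_t1_full[None], grid[None, :, None, :],
                            align_corners=True)            # (1, 3, N, 1)
    rgb_t1 = sampled[0, :, :, 0].permute(1, 0)             # (N, 3)

    # Photometric consistency between a dynamic pixel at t and its
    # flow-warped correspondence at t+1.
    diff = (rgb_t - rgb_t1).abs().sum(dim=-1)
    return (diff * dyn_mask.float()).sum() / dyn_mask.float().sum().clamp(min=1)
```

In words, a point on a moving object should render to consistent colors in consecutive frames, and optical flow supplies the correspondence that a single monocular camera cannot establish geometrically.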
Problem

Research questions and friction points this paper is trying to address.

Reconstruct dynamic driving scenes without pose inputs
Decouple dynamic and static parts using semantic supervision
Improve dynamic modeling with optical flow constraints
Innovation

Methods, ideas, or system contributions that make the work stand out.

Monocular RGB NeRF without pose inputs (see the pose-optimization sketch after this list)
Point-level dynamic-static decoupling via semantics
Warped ray-guided dynamic rendering consistency
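As noted in the first item above, pose-free operation relies on optimizing the camera poses jointly with the radiance field, with the estimated dynamic flow used to regularize that optimization. One plausible form of such a constraint is sketched below: the rigid flow induced by the optimized relative pose and the rendered depth, plus the estimated dynamic flow, should explain the observed optical flow. Names and the exact residual are assumptions for illustration, not the paper's loss.

```python
import torch

def pose_flow_consistency(depth_t, K, T_t_to_t1, pix_coords, flow_obs, dyn_flow):
    """Illustrative dynamic-flow-regularized pose term (assumed form).

    depth_t:     (N,) rendered depth at frame t for the sampled pixels
    K:           (3, 3) camera intrinsics
    T_t_to_t1:   (4, 4) optimized relative camera pose, frame t -> t+1
    pix_coords:  (N, 2) pixel coordinates (x, y) at frame t
    flow_obs:    (N, 2) observed optical flow, t -> t+1
    dyn_flow:    (N, 2) estimated dynamic (object-motion) flow, t -> t+1
    """
    N = pix_coords.shape[0]
    ones = torch.ones(N, 1, device=depth_t.device)
    # Back-project pixels into 3D camera coordinates at frame t.
    pix_h = torch.cat([pix_coords, ones], dim=-1)            # (N, 3)
    cam_pts = (torch.linalg.inv(K) @ pix_h.T).T * depth_t[:, None]
    # Move the points with the optimized camera motion and re-project.
    cam_pts_h = torch.cat([cam_pts, ones], dim=-1)           # (N, 4)
    pts_t1 = (T_t_to_t1 @ cam_pts_h.T).T[:, :3]
    proj = (K @ pts_t1.T).T
    pix_t1 = proj[:, :2] / proj[:, 2:3].clamp(min=1e-6)
    rigid_flow = pix_t1 - pix_coords
    # The flow not explained by camera motion should match the dynamic flow,
    # so moving objects do not bias the pose estimate.
    return (rigid_flow + dyn_flow - flow_obs).abs().mean()
```

Because dynamic flow is accounted for explicitly, gradients from moving objects are less likely to corrupt the implicit pose estimates in unbounded driving scenes.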
Yue Wen
University of Central Florida
Prosthetics, Rehabilitation robotics, Machine learning, Adaptive control, Neural interface
Liang Song
Dimanshen Technology Co., Ltd. specializes in 3D SLAM and robotic vision fusion technology, offering all-terrain intelligent robotic solutions for smart security and smart campus applications.
Yijia Liu
Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, School of Information and Control Engineering, Advanced Robotics Research Center, China University of Mining and Technology, Xuzhou 221116, China
Siting Zhu
Department of Automation, Key Laboratory of System Control and Information Processing of Ministry of Education, Key Laboratory of Marine Intelligent Equipment and System of Ministry of Education, Shanghai Engineering Research Center of Intelligent Control and Management, Shanghai Jiao Tong University, Shanghai 200240, China
Yanzi Miao
Engineering Research Center of Intelligent Control for Underground Space, Ministry of Education, School of Information and Control Engineering, Advanced Robotics Research Center, China University of Mining and Technology, Xuzhou 221116, China
Lijun Han
Shanghai Jiao Tong University
Hesheng Wang
Department of Automation, Key Laboratory of System Control and Information Processing of Ministry of Education, Key Laboratory of Marine Intelligent Equipment and System of Ministry of Education, Shanghai Engineering Research Center of Intelligent Control and Management, Shanghai Jiao Tong University, Shanghai 200240, China