OG-Gaussian: Occupancy Based Street Gaussians for Autonomous Driving

📅 2025-02-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To reduce the cost of LiDAR sensors and manual annotation in autonomous driving simulation, this paper proposes OG-Gaussian, a vision-based method for dynamic 3D street scene reconstruction. The method needs neither LiDAR data nor pre-annotated dynamic objects and operates solely on surround-view camera images. The core contributions are: (1) an Occupancy Prediction Network (ONet) that generates semantic occupancy grids, which replace LiDAR point clouds as the geometric prior for initializing 3D Gaussians; and (2) a 3D Gaussian Splatting framework that models the static street background and dynamic vehicles separately while estimating vehicle poses and trajectories through learning rather than manual annotation. On the Waymo Open Dataset, the method reaches an average PSNR of 35.13 and a rendering speed of 143 FPS, on par with the state of the art, while reducing computational and hardware costs.
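
For readers unfamiliar with the reported metric, the following is a minimal sketch of the standard PSNR definition, 10 · log10(MAX² / MSE). It is not the paper's evaluation code: the function name `psnr`, the flat-list image representation, and the default peak value of 1.0 are assumptions made for illustration.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float], reference: Sequence[float],
         max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat sequences of pixel intensities in [0, max_value].
    This representation is an assumption for the sketch, not the paper's setup.
    """
    if len(rendered) != len(reference) or len(rendered) == 0:
        raise ValueError("images must be non-empty and have the same size")
    # Mean squared error over all pixel values.
    mse = sum((a - b) ** 2 for a, b in zip(rendered, reference)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Under this definition, and with pixel values scaled to [0, 1], a PSNR of 35.13 corresponds to a mean squared error of roughly 3 × 10⁻⁴.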

📝 Abstract
Accurate and realistic 3D scene reconstruction enables the creation of lifelike autonomous driving simulation environments. With advancements in 3D Gaussian Splatting (3DGS), previous studies have applied it to reconstruct complex dynamic driving scenes. These methods typically require expensive LiDAR sensors and pre-annotated datasets of dynamic objects. To address these challenges, we propose OG-Gaussian, a novel approach that replaces LiDAR point clouds with Occupancy Grids (OGs) generated from surround-view camera images using an Occupancy Prediction Network (ONet). Our method leverages the semantic information in OGs to separate dynamic vehicles from the static street background, converting these grids into two distinct sets of initial point clouds for reconstructing static and dynamic objects. Additionally, we estimate the trajectories and poses of dynamic objects through a learning-based approach, eliminating the need for complex manual annotations. Experiments on the Waymo Open Dataset demonstrate that OG-Gaussian is on par with the current state of the art in reconstruction quality and rendering speed, achieving an average PSNR of 35.13 and a rendering speed of 143 FPS, while significantly reducing computational and economic costs.
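
The grid-to-point-cloud step described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption: the label values, the sparse dictionary representation of the occupancy grid, the grid origin, and the 0.4 m voxel size are hypothetical, and the real ONet output format is not specified on this page.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
Point = Tuple[float, float, float]

# Hypothetical semantic labels; the actual ONet class set is not given here.
FREE, ROAD, BUILDING, VEHICLE = 0, 1, 2, 3
DYNAMIC_LABELS = {VEHICLE}


def voxel_center(idx: Voxel, origin: Point, voxel_size: float) -> Point:
    """Metric center of a voxel, given the grid origin and voxel edge length."""
    x, y, z = (o + (i + 0.5) * voxel_size for i, o in zip(idx, origin))
    return (x, y, z)


def split_occupancy_grid(
    grid: Dict[Voxel, int],
    origin: Point = (0.0, 0.0, 0.0),
    voxel_size: float = 0.4,  # assumed value, for illustration only
) -> Tuple[List[Point], List[Point]]:
    """Turn occupied voxels into two initial point sets: static and dynamic."""
    static_pts: List[Point] = []
    dynamic_pts: List[Point] = []
    for idx, label in sorted(grid.items()):  # sorted for deterministic order
        if label == FREE:
            continue  # empty space contributes no points
        center = voxel_center(idx, origin, voxel_size)
        if label in DYNAMIC_LABELS:
            dynamic_pts.append(center)
        else:
            static_pts.append(center)
    return static_pts, dynamic_pts
```

In this sketch each occupied voxel contributes one point at its center. The two returned lists stand in for the two initial point clouds mentioned above, one for the street background and one for vehicles.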
Problem

Research questions and friction points this paper is trying to address.

Reconstruct 3D scenes without LiDAR
Separate dynamic and static objects
Estimate object trajectories without annotations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Occupancy Grids replace LiDAR
Learning-based trajectory estimation
Semantic separation of dynamic objects
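
One way to picture how an estimated pose is used once dynamic objects are separated: the object's points are kept in an object-local frame and placed into the world frame for each timestamp. The sketch below assumes a yaw-only rotation plus a translation per frame. That parameterization, the function names, and the trajectory layout are illustrative assumptions, not the paper's formulation, and the learning procedure that produces the poses is not shown.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Pose = Tuple[float, Point]  # (yaw in radians, translation); assumed layout


def object_to_world(points: Sequence[Point], yaw: float,
                    translation: Point) -> List[Point]:
    """Rotate object-local points about the z axis, then translate them."""
    c, s = math.cos(yaw), math.sin(yaw)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz)
            for x, y, z in points]


def place_along_trajectory(points: Sequence[Point],
                           trajectory: Sequence[Pose]) -> List[List[Point]]:
    """Apply each per-frame pose to the same object-local points."""
    return [object_to_world(points, yaw, t) for yaw, t in trajectory]
```

Here the object's geometry is stored once and only the per-frame pose changes over time, so the pose values are the quantities a learning-based estimator would need to produce.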
Yedong Shen
School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China
Xinran Zhang
University of Science and Technology of China
SLAM, NeRF, 3DGS
Yifan Duan
School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China
Shiqi Zhang
School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei 230026, China
Heng Li
School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China
Yilong Wu
Fudan University
Natural Language Processing
Jianmin Ji
University of Science and Technology of China
Cognitive Robotics, Reinforcement Learning, Answer Set Programming
Yanyong Zhang
University of Science and Technology of China; Rutgers University (Adjunct Visiting Professor)
Sensing, Cyber-Physical Systems, Multi-Modal Perception, Efficient AI Systems