Realistic and Controllable 3D Gaussian-Guided Object Editing for Driving Video Generation

📅 2025-08-28
🤖 AI Summary
To address the scarcity, high acquisition cost, and safety risks of collecting corner cases for autonomous driving, this paper proposes G^2Editor, a 3D-aware framework for editing objects in driving videos. It uses a 3D Gaussian representation of the edited object as a dense prior, injected into the denoising process, to control the object's pose and keep the result spatially consistent. A scene-level 3D bounding box layout guides the reconstruction of occluded regions of non-target objects, and hierarchical fine-grained features condition the appearance of the edited object. Experiments on the Waymo Open Dataset show that G^2Editor supports object repositioning, insertion, and deletion within one framework and outperforms existing methods in both pose controllability and visual quality, while also benefiting downstream data-driven tasks.

📝 Abstract
Corner cases are crucial for training and validating autonomous driving systems, yet collecting them from the real world is often costly and hazardous. Editing objects within captured sensor data offers an effective alternative for generating diverse scenarios, commonly achieved through 3D Gaussian Splatting or image generative models. However, these approaches often suffer from limited visual fidelity or imprecise pose control. To address these issues, we propose G^2Editor, a framework designed for photorealistic and precise object editing in driving videos. Our method leverages a 3D Gaussian representation of the edited object as a dense prior, injected into the denoising process to ensure accurate pose control and spatial consistency. A scene-level 3D bounding box layout is employed to reconstruct occluded areas of non-target objects. Furthermore, to guide the appearance details of the edited object, we incorporate hierarchical fine-grained features as additional conditions during generation. Experiments on the Waymo Open Dataset demonstrate that G^2Editor effectively supports object repositioning, insertion, and deletion within a unified framework, outperforming existing methods in both pose controllability and visual quality, while also benefiting downstream data-driven tasks.
Problem

Research questions and friction points this paper is trying to address.

Generates realistic driving videos with precise object editing
Addresses limited visual fidelity in 3D Gaussian editing methods
Solves imprecise pose control in driving scenario generation
Innovation

Methods, ideas, or system contributions that make the work stand out.

3D Gaussian representation for pose control
Scene-level 3D bounding box for occlusion handling
Hierarchical fine-grained features for appearance guidance
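The role of a 3D bounding box layout in pose control can be illustrated with a small geometric sketch. This is not the paper's implementation: it only shows, under assumed conventions, how changing a box's position and heading changes the image region the edited object should occupy. The function names (`box_corners`, `project`, `box_footprint`), the camera convention (x right, y down, z forward, yaw about the y axis), and the intrinsics values are all hypothetical choices made for illustration.

```python
import math


def box_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    Assumed convention (not from the paper): x right, y down, z forward.
    size is (width along x, height along y, length along z) in the box's
    local frame, and yaw rotates the box about the y axis.
    """
    cx, cy, cz = center
    w, h, l = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-w / 2, w / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-l / 2, l / 2):
                # Rotate the local offset about the y axis, then translate.
                x = c * dx + s * dz
                z = -s * dx + c * dz
                corners.append((cx + x, cy + dy, cz + z))
    return corners


def project(point, fx, fy, px, py):
    """Pinhole projection of one camera-frame point to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + px, fy * y / z + py)


def box_footprint(center, size, yaw, intrinsics):
    """Axis-aligned image rectangle (u_min, v_min, u_max, v_max) of the box."""
    pts = [project(p, *intrinsics) for p in box_corners(center, size, yaw)]
    us = [u for u, _ in pts]
    vs = [v for _, v in pts]
    return (min(us), min(vs), max(us), max(vs))


if __name__ == "__main__":
    # Hypothetical intrinsics (fx, fy, px, py) and vehicle size in metres.
    intrinsics = (1000.0, 1000.0, 960.0, 640.0)
    size = (2.0, 1.5, 4.0)
    original = box_footprint((0.0, 0.0, 10.0), size, 0.0, intrinsics)
    moved = box_footprint((3.0, 0.0, 20.0), size, math.radians(30), intrinsics)
    print("original footprint:", original)
    print("repositioned footprint:", moved)
```

Moving the box farther away and to the right shrinks and shifts its footprint, which is the kind of pose-dependent spatial constraint that a dense 3D prior would have to respect during generation.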
Jiusi Li
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
Jackson Jiang
WUWEN AI, Beijing 100084, China
Jinyu Miao
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
Miao Long
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
Tuopu Wen
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
Peijin Jia
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
Shengxiang Liu
WUWEN AI, Beijing 100084, China
Chunlei Yu
WUWEN AI, Beijing 100084, China
Maolin Liu
WUWEN AI, Beijing 100084, China
Yuzhan Cai
PhiGent Robotics, Beijing 100084, China
Kun Jiang
Tsinghua University
autonomous driving
Mengmeng Yang
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
Diange Yang
School of Vehicle and Mobility, and State Key Laboratory of Intelligent Green Vehicle and Mobility, Tsinghua University, Beijing, 100084, China