HorizonForge: Driving Scene Editing with Any Trajectories and Any Vehicles

📅 2026-02-24
📈 Citations: 0 · Influential: 0
🤖 AI Summary
Existing autonomous driving simulation methods struggle to balance photorealism and controllability in scene generation. This work proposes a Gaussian-mesh hybrid representation to construct a unified, editable 3D scene framework that supports language-driven vehicle insertion and fine-grained trajectory manipulation. To ensure spatiotemporal consistency, the method renders edits through noise-aware video diffusion and operates in a single feed-forward pass, eliminating the need for per-trajectory optimization. The authors present this as the first approach to achieve high-fidelity, arbitrarily controllable driving scene editing. Experiments demonstrate an 83.4% improvement in user preference and a 25.19% reduction in FID relative to the next-best method. The authors also introduce HorizonSuite, a comprehensive benchmark for evaluating controllable driving simulation.
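The summary outlines a reconstruct-once, edit-many pipeline. As a rough illustration only, here is a minimal runnable Python sketch of that flow; every class, function, and field name below is hypothetical (the paper card exposes no code or API), and the stubs stand in for the actual reconstruction, asset-generation, and diffusion models.

```python
# Hypothetical sketch of the single-pass feed-forward editing flow described
# above. All names are invented for illustration; the stubs only make the
# data flow concrete: reconstruct once, edit in 3D, refine in one pass.
from dataclasses import dataclass, field

@dataclass
class HybridScene:
    """Stand-in for the Gaussian-mesh hybrid scene (assumed split:
    splats for background, meshes for editable vehicles)."""
    gaussians: list = field(default_factory=list)  # background splats
    vehicles: list = field(default_factory=list)   # editable vehicle meshes

    def insert_vehicle(self, prompt: str, trajectory: list) -> None:
        # Language-driven insertion: a text-conditioned asset generator
        # would run here; the stub just records the request.
        self.vehicles.append({"prompt": prompt, "trajectory": trajectory})

    def rasterize(self) -> list[str]:
        # Geometry-consistent but artifact-prone renders, one per frame.
        horizon = max((len(v["trajectory"]) for v in self.vehicles), default=1)
        return [f"rough_frame_{t}" for t in range(horizon)]

def diffusion_refine(rough_frames: list[str]) -> list[str]:
    # Noise-aware video diffusion would denoise the rough renders in a
    # single feed-forward pass; stubbed as a renaming pass-through.
    return [f.replace("rough", "refined") for f in rough_frames]

# One reconstruction, arbitrarily many edits, no per-trajectory optimization.
scene = HybridScene()
scene.insert_vehicle("a red pickup truck merging left", [(0, 0), (1, 0), (2, 1)])
frames = diffusion_refine(scene.rasterize())
print(frames)  # ['refined_frame_0', 'refined_frame_1', 'refined_frame_2']
```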

📝 Abstract
Controllable driving scene generation is critical for realistic and scalable autonomous driving simulation, yet existing approaches struggle to jointly achieve photorealism and precise control. We introduce HorizonForge, a unified framework that reconstructs scenes as editable Gaussian Splats and Meshes, enabling fine-grained 3D manipulation and language-driven vehicle insertion. Edits are rendered through a noise-aware video diffusion process that enforces spatial and temporal consistency, producing diverse scene variations in a single feed-forward pass without per-trajectory optimization. To standardize evaluation, we further propose HorizonSuite, a comprehensive benchmark spanning ego- and agent-level editing tasks such as trajectory modifications and object manipulation. Extensive experiments show that the Gaussian-Mesh representation delivers substantially higher fidelity than alternative 3D representations, and that temporal priors from video diffusion are essential for coherent synthesis. Combining these findings, HorizonForge establishes a simple yet powerful paradigm for photorealistic, controllable driving simulation, achieving an 83.4% user-preference gain and a 25.19% FID improvement over the second-best state-of-the-art method. Project page: https://horizonforge.github.io/
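For readers unfamiliar with the two primitives the abstract pairs, the sketch below shows how a 3D Gaussian splat and an editable vehicle mesh are typically parameterized in the splatting literature. The field names and the covariance factorization are standard 3D Gaussian Splatting conventions, not HorizonForge's actual data structures (no code is quoted on this page).

```python
# Generic parameterization of the two primitives in a Gaussian-mesh hybrid
# scene, following standard 3D Gaussian Splatting conventions; field names
# are illustrative, not taken from HorizonForge.
from dataclasses import dataclass
import numpy as np

@dataclass
class Gaussian3D:
    mean: np.ndarray      # (3,) center in world space
    scale: np.ndarray     # (3,) per-axis standard deviations
    rotation: np.ndarray  # (4,) unit quaternion (w, x, y, z)
    opacity: float        # alpha in [0, 1]
    sh_color: np.ndarray  # spherical-harmonic color coefficients

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T, the usual splatting factorization."""
        w, x, y, z = self.rotation
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T

@dataclass
class VehicleMesh:
    vertices: np.ndarray  # (V, 3) rest-pose geometry
    faces: np.ndarray     # (F, 3) triangle vertex indices
    poses: np.ndarray     # (T, 4, 4) rigid transform per frame: the handle
                          # that trajectory edits manipulate directly

g = Gaussian3D(np.zeros(3), np.array([0.2, 0.1, 0.1]),
               np.array([1.0, 0.0, 0.0, 0.0]), 0.9, np.zeros(3))
print(g.covariance())  # diagonal: identity rotation leaves S S^T unchanged
```

The mesh carrying an explicit per-frame rigid pose is what makes trajectory edits cheap: moving a vehicle is a transform update rather than a re-optimization of the radiance field.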
Problem

Research questions and friction points this paper is trying to address.

driving scene generation
photorealism
precise control
3D scene editing
autonomous driving simulation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gaussian Splatting
Video Diffusion
Controllable Scene Generation
Driving Simulation
3D Scene Editing