Optimization-Guided Diffusion for Interactive Scene Generation

📅 2025-12-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing generative models struggle to ensure realism, physical plausibility, and behavioral controllability at the same time in multi-agent driving scenarios, particularly when generating safety-critical scenes. To address this, the paper proposes OMEGA, a training-free framework that guides the reverse sampling process of a diffusion-based scene generation model: each reverse step is re-anchored through constrained optimization so that agent trajectories stay physically plausible and behaviorally coherent. Building on this, OMEGA formulates ego-attacker interactions as a game-theoretic optimization in the distribution space and approximates Nash equilibria to produce realistic adversarial behavior. Evaluated on nuPlan and Waymo, OMEGA increases the proportion of physically and behaviorally valid scenes from 32.35% to 72.27% under free generation and from 11% to 80% under controllability-focused generation, and it produces 5× more near-collision frames (time-to-collision under three seconds) while maintaining overall scene realism. These gains make the generated scenarios more useful for autonomous driving evaluation.

📝 Abstract
Realistic and diverse multi-agent driving scenes are crucial for evaluating autonomous vehicles, but safety-critical events, which are essential for this task, are rare and underrepresented in driving datasets. Data-driven scene generation offers a low-cost alternative by synthesizing complex traffic behaviors from existing driving logs. However, existing models often lack controllability or yield samples that violate physical or social constraints, limiting their usability. We present OMEGA, an optimization-guided, training-free framework that enforces structural consistency and interaction awareness during diffusion-based sampling from a scene generation model. OMEGA re-anchors each reverse diffusion step via constrained optimization, steering the generation towards physically plausible and behaviorally coherent trajectories. Building on this framework, we formulate ego-attacker interactions as a game-theoretic optimization in the distribution space, approximating Nash equilibria to generate realistic, safety-critical adversarial scenarios. Experiments on nuPlan and Waymo show that OMEGA improves generation realism, consistency, and controllability, increasing the ratio of physically and behaviorally valid scenes from 32.35% to 72.27% for free exploration capabilities, and from 11% to 80% for controllability-focused generation. Our approach can also generate $5\times$ more near-collision frames with a time-to-collision under three seconds while maintaining the overall scene realism.
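
The near-collision criterion in the abstract (time-to-collision under three seconds) can be illustrated with a minimal sketch. It assumes each agent is a disc moving at constant velocity; the function names, the 2 m combined radius, and the frame layout are hypothetical and not taken from the paper, which may define time-to-collision differently.

```python
import math

def time_to_collision(p1, v1, p2, v2, radius=2.0):
    """Earliest t >= 0 at which two constant-velocity discs touch.

    `radius` is the assumed combined radius of both agents (hypothetical value).
    Returns math.inf if the agents never come within `radius` of each other.
    """
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]   # relative position
    vx, vy = v2[0] - v1[0], v2[1] - v1[1]   # relative velocity
    c = rx * rx + ry * ry - radius * radius
    if c <= 0.0:
        return 0.0                           # already overlapping
    a = vx * vx + vy * vy
    b = 2.0 * (rx * vx + ry * vy)
    if a == 0.0 or b >= 0.0:
        return math.inf                      # no relative motion, or moving apart
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return math.inf                      # closest approach stays outside radius
    return (-b - math.sqrt(disc)) / (2.0 * a)

def count_near_collision_frames(frames, threshold=3.0, radius=2.0):
    """Count frames whose TTC is below `threshold` seconds.

    Each frame is an assumed tuple (p_ego, v_ego, p_other, v_other).
    """
    return sum(
        1 for p1, v1, p2, v2 in frames
        if time_to_collision(p1, v1, p2, v2, radius) < threshold
    )

frames = [
    ((0.0, 0.0), (10.0, 0.0), (22.0, 0.0), (0.0, 0.0)),   # closing fast
    ((0.0, 0.0), (10.0, 0.0), (52.0, 0.0), (0.0, 0.0)),   # closing, but far away
    ((0.0, 0.0), (10.0, 0.0), (22.0, 0.0), (20.0, 0.0)),  # lead agent pulling away
]
near = count_near_collision_frames(frames)
```

Under these assumptions only the first frame counts as near-collision: the gap closes at 10 m/s and the discs touch after 2 s, while the second frame needs 5 s and the third never closes.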
Problem

Research questions and friction points this paper is trying to address.

Generates realistic multi-agent driving scenes for autonomous vehicle testing
Ensures physical plausibility and behavioral coherence in synthesized traffic scenarios
Creates safety-critical adversarial interactions through game-theoretic optimization
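
The idea of approximating a Nash equilibrium between an ego vehicle and an attacker can be sketched on a toy problem. The example below is not the paper's formulation (which operates in the distribution space of a diffusion model); it assumes two scalar lateral offsets and hand-picked quadratic costs, and uses simultaneous gradient play, a standard technique swapped in purely for illustration. All names and cost functions are hypothetical.

```python
def ego_grad(x, y):
    # Gradient w.r.t. x of the assumed ego cost
    #   J_ego(x, y) = x**2 - 0.5 * (x - y)**2
    # (stay near lane center 0, but keep distance from the attacker).
    return 2.0 * x - (x - y)

def attacker_grad(x, y):
    # Gradient w.r.t. y of the assumed attacker cost
    #   J_att(x, y) = (x - y)**2 + (y - 3)**2
    # (get close to the ego, but do not stray far from its own lane at 3).
    return -2.0 * (x - y) + 2.0 * (y - 3.0)

def approximate_nash(x=0.0, y=3.0, lr=0.1, steps=200):
    """Simultaneous gradient play: each player descends its own cost."""
    for _ in range(steps):
        gx, gy = ego_grad(x, y), attacker_grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    return x, y

x_star, y_star = approximate_nash()
```

For these particular costs the iteration settles near x = -1, y = 1: each cost is convex in its own player's variable and both gradients vanish there, so neither player can lower its cost by deviating alone. The attacker has moved toward the ego while the ego has shifted away, which is the kind of coupled adversarial behavior the game-theoretic view is meant to capture.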
Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimization-guided diffusion framework for scene generation
Constrained optimization ensures physical and behavioral plausibility
Game-theoretic optimization generates adversarial safety-critical scenarios
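
One way to picture optimization guidance during reverse sampling is a loop that denoises, then projects the sample back onto a constraint set before continuing. The sketch below is a toy under strong assumptions: the "denoiser" is a hypothetical stand-in that pulls toward a reference trajectory, the constraint is a simple lateral bound whose Euclidean projection has a closed form, and the noise schedule is invented. It is not the paper's algorithm.

```python
import random

def denoise_step(x, t, reference):
    # Hypothetical stand-in for a learned denoiser: blend toward a reference
    # trajectory, with a stronger pull as t approaches 0.
    w = 1.0 / (t + 1)
    return [(1.0 - w) * xi + w * ri for xi, ri in zip(x, reference)]

def project_to_box(x, lo, hi):
    # Closed-form solution of: minimize ||z - x||^2 subject to lo <= z_i <= hi.
    return [min(max(xi, lo), hi) for xi in x]

def guided_sampling(reference, lo, hi, num_steps=50, noise_scale=0.5, seed=0):
    """Toy reverse loop with a projection applied after every denoising step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 3.0) for _ in reference]    # start from noise
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t, reference)
        x = project_to_box(x, lo, hi)               # constrained correction
        if t > 0:                                    # re-inject noise except at the end
            scale = noise_scale * t / num_steps
            x = [xi + scale * rng.gauss(0.0, 1.0) for xi in x]
    return x

# Assumed example: lateral offsets (m) that must stay within [-2, 2].
sample = guided_sampling(reference=[0.0, 1.0, 2.5, 4.0], lo=-2.0, hi=2.0)
```

Because the projection runs at every step rather than once at the end, intermediate samples never drift far outside the feasible set, and the final sample satisfies the bound by construction.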
Authors

Shiaho Li
Beijing Institute of Technology
Naisheng Ye
OpenDriveLab at The University of Hong Kong
Tianyu Li
OpenDriveLab at The University of Hong Kong
Kashyap Chitta
NVIDIA
Physical AI, Autonomous Driving, Robot Learning
Tuo An
Nanyang Technological University
Peng Su
Ph.D at The Chinese University of Hong Kong
Deep Learning, Physical AI, Autonomous Driving
Boyang Wang
Beijing Institute of Technology
Haiou Liu
Beijing Institute of Technology
Chen Lv
Nanyang Technological University
Hongyang Li
OpenDriveLab at The University of Hong Kong