Optimization-Guided Diffusion for Interactive Scene Generation

📅 2025-12-08

🤖 AI Summary

Existing generative models struggle to simultaneously ensure realism, physical plausibility, and behavioral controllability in multi-agent driving scenarios, particularly when generating safety-critical scenes. To address this, we propose OMEGA, a training-free framework that guides a diffusion-based scene generation model during reverse sampling. Each reverse diffusion step is re-anchored via constrained optimization, which steers the sample toward physically plausible and behaviorally coherent trajectories. Ego-attacker interactions are formulated as a game-theoretic optimization in the distribution space, and Nash equilibria are approximated to produce realistic adversarial behavior. Evaluated on nuPlan and Waymo, OMEGA increases the proportion of physically and behaviorally valid scenes from 32.35% to 72.27% under free generation and from 11% to 80% under controllability-focused generation, and it yields 5× more near-collision frames (time-to-collision under three seconds) while maintaining overall scene realism. These gains make the generated scenarios more useful for autonomous driving evaluation.

📝 Abstract

Realistic and diverse multi-agent driving scenes are crucial for evaluating autonomous vehicles, but safety-critical events, which are essential for this task, are rare and underrepresented in driving datasets. Data-driven scene generation offers a low-cost alternative by synthesizing complex traffic behaviors from existing driving logs. However, existing models often lack controllability or yield samples that violate physical or social constraints, limiting their usability. We present OMEGA, an optimization-guided, training-free framework that enforces structural consistency and interaction awareness during diffusion-based sampling from a scene generation model. OMEGA re-anchors each reverse diffusion step via constrained optimization, steering the generation towards physically plausible and behaviorally coherent trajectories. Building on this framework, we formulate ego-attacker interactions as a game-theoretic optimization in the distribution space, approximating Nash equilibria to generate realistic, safety-critical adversarial scenarios. Experiments on nuPlan and Waymo show that OMEGA improves generation realism, consistency, and controllability, increasing the ratio of physically and behaviorally valid scenes from 32.35% to 72.27% for free exploration, and from 11% to 80% for controllability-focused generation. Our approach can also generate $5\times$ more near-collision frames with a time-to-collision under three seconds while maintaining the overall scene realism.
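The near-collision criterion in the abstract (time-to-collision under three seconds) can be made concrete with a small sketch. The abstract does not say how time-to-collision is computed, so everything below is an assumption for illustration: agents are modeled as discs moving at constant velocity, `radius` is a hypothetical combined agent radius in meters, and the function names are invented here.

```python
import math

def time_to_collision(p_a, v_a, p_b, v_b, radius=2.0):
    """Earliest t >= 0 at which two constant-velocity discs touch.

    Assumption: `radius` is the sum of both agents' radii.
    Returns math.inf if the agents never come within `radius`.
    """
    # Position and velocity of agent b relative to agent a.
    rx, ry = p_b[0] - p_a[0], p_b[1] - p_a[1]
    vx, vy = v_b[0] - v_a[0], v_b[1] - v_a[1]

    c = rx * rx + ry * ry - radius * radius
    if c <= 0.0:
        return 0.0  # already overlapping
    a = vx * vx + vy * vy
    b = 2.0 * (rx * vx + ry * vy)
    if a == 0.0 or b >= 0.0:
        return math.inf  # no relative motion, or moving apart
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return math.inf  # closest approach stays outside the radius
    # Smaller root of |r + v t|^2 = radius^2.
    return (-b - math.sqrt(disc)) / (2.0 * a)

def count_near_collision_frames(ttc_per_frame, threshold_s=3.0):
    """Number of frames whose time-to-collision is below the threshold."""
    return sum(1 for ttc in ttc_per_frame if ttc < threshold_s)
```

For example, a vehicle at the origin driving at 10 m/s toward a stationary agent 22 m ahead reaches the 2 m contact distance after 2 s, so that frame would count as near-collision under the three-second threshold.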

Problem

Research questions and friction points this paper is trying to address.

Generates realistic multi-agent driving scenes for autonomous vehicle testing

Ensures physical plausibility and behavioral coherence in synthesized traffic scenarios

Creates safety-critical adversarial interactions through game-theoretic optimization
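As a rough illustration of what approximating a Nash equilibrium between an ego vehicle and an attacker can look like, here is a toy sketch using iterated best responses. It is not the paper's formulation: OMEGA is described as operating in the distribution space, whereas this example plays the game over two scalars (think of them as lateral offsets). The cost functions and the names `repulsion`, `realism`, `u_nominal`, and `w_nominal` are all assumptions made for this example.

```python
def ego_best_response(w, u_nominal, repulsion):
    """Ego minimizes (u - u_nominal)^2 - repulsion * (u - w)^2 over u.

    The ego stays near its nominal offset while moving away from the
    attacker. The cost is convex in u only when repulsion < 1.
    """
    return (u_nominal - repulsion * w) / (1.0 - repulsion)

def attacker_best_response(u, w_nominal, realism):
    """Attacker minimizes (w - u)^2 + realism * (w - w_nominal)^2 over w.

    The attacker closes in on the ego, and the realism weight keeps it
    near its own nominal behavior.
    """
    return (u + realism * w_nominal) / (1.0 + realism)

def approximate_nash(u_nominal=0.0, w_nominal=2.0,
                     repulsion=0.2, realism=1.0, iters=50):
    """Alternate best responses until neither player wants to deviate."""
    u, w = u_nominal, w_nominal
    for _ in range(iters):
        u = ego_best_response(w, u_nominal, repulsion)
        w = attacker_best_response(u, w_nominal, realism)
    return u, w
```

With the default values, the alternation is a contraction and settles at u = -2/9 and w = 8/9. At that point each player's choice is a best response to the other's, which is the defining property of a Nash equilibrium. The realism term plays the role hinted at in the text, keeping the adversary aggressive but not implausible.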

Innovation

Methods, ideas, or system contributions that make the work stand out.

Optimization-guided diffusion framework for scene generation

Constrained optimization ensures physical and behavioral plausibility

Game-theoretic optimization generates adversarial safety-critical scenarios
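The optimization-guided sampling idea listed above can be illustrated with a minimal toy sketch. It assumes a deterministic DDIM-style update, a per-step acceleration sequence as the sampled variable, and a box constraint on acceleration whose constrained least-squares solution is simple clipping. The `toy_denoiser` is a hypothetical stand-in for a trained scene generation network, and none of these choices are taken from the paper.

```python
import math
import random

def project_to_limits(accels, a_max):
    """Exact minimizer of ||a - accels||^2 subject to |a_i| <= a_max."""
    return [max(-a_max, min(a_max, a)) for a in accels]

def toy_denoiser(x_t, abar_t, prior_mean, prior_std):
    """Posterior mean of x0 given x_t when x0 ~ N(prior_mean, prior_std^2).

    Hypothetical stand-in for a trained network's clean-sample estimate.
    """
    s2 = prior_std ** 2
    root = math.sqrt(abar_t)
    gain = root * s2 / (abar_t * s2 + 1.0 - abar_t)
    return [m + gain * (x - root * m) for x, m in zip(x_t, prior_mean)]

def guided_sample(prior_mean, a_max, steps=20, prior_std=0.5, seed=0):
    """Reverse diffusion where each step is re-anchored on a feasible x0."""
    rng = random.Random(seed)
    # Cumulative signal level: abar[0] = 1 (clean), abar[steps] = 0.01.
    abar = [1.0 - 0.99 * t / steps for t in range(steps + 1)]
    x = [rng.gauss(0.0, 1.0) for _ in prior_mean]
    for t in range(steps, 0, -1):
        x0_hat = toy_denoiser(x, abar[t], prior_mean, prior_std)
        # Re-anchor: replace the estimate with its constrained projection.
        x0_feasible = project_to_limits(x0_hat, a_max)
        # Noise implied by the current sample and the re-anchored estimate.
        scale = math.sqrt(1.0 - abar[t])
        eps = [(xi - math.sqrt(abar[t]) * x0i) / scale
               for xi, x0i in zip(x, x0_feasible)]
        # Deterministic step to the next (less noisy) level.
        x = [math.sqrt(abar[t - 1]) * x0i + math.sqrt(1.0 - abar[t - 1]) * e
             for x0i, e in zip(x0_feasible, eps)]
    return x
```

Calling `guided_sample([5.0, -5.0, 1.0], a_max=3.0)` returns accelerations that all lie within the limit, even though the toy prior is centered outside it. Because the constraint is applied inside every reverse step rather than once at the end, no retraining of the generator is needed, which matches the training-free framing of the text.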
