CounterScene: Counterfactual Causal Reasoning in Generative World Models for Safety-Critical Closed-Loop Evaluation

📅 2026-03-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing methods for generating safety-critical driving scenarios struggle to balance realism and adversarial effectiveness because they do not explicitly model inter-agent interaction dependencies. This work proposes CounterScene, a framework that adds structured counterfactual reasoning to closed-loop generative bird’s-eye-view (BEV) world models. Using a causal interaction graph, the approach identifies the causally critical agent and applies minimal interventions to it, so that risk propagates through naturalistic interactions. The method combines causal adversarial agent identification, conflict-type classification, and stage-adaptive counterfactual guidance to ease the realism–adversariality trade-off. On nuScenes, the long-horizon collision rate rises from 12.3% for the strongest baseline to 22.7%, with better trajectory realism (ADE 1.88 vs. 2.09); on nuPlan, the method attains state-of-the-art realism in a zero-shot setting.
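For readers unfamiliar with the realism metric quoted above, ADE (average displacement error) is conventionally the mean Euclidean distance between generated and reference waypoints, so lower values indicate closer agreement with logged behavior. The summary does not state the evaluation horizon or how errors are averaged across agents, so the function below is only a generic single-trajectory sketch; the name `average_displacement_error` and the sample waypoints are illustrative assumptions.

```python
import math


def average_displacement_error(pred, truth):
    """Mean Euclidean distance between matched (x, y) waypoints.

    Generic definition only; the paper's exact protocol (horizon,
    per-agent averaging) is not given in this summary.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)


# Made-up waypoints: per-step errors are 0, 1 and 2 metres.
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade = average_displacement_error(pred, truth)
```

With these sample waypoints the per-step errors average to 1.0.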

📝 Abstract

Generating safety-critical driving scenarios requires understanding why dangerous interactions arise, rather than merely forcing collisions. However, existing methods rely on heuristic adversarial agent selection and unstructured perturbations, lacking explicit modeling of interaction dependencies and thus exhibiting a realism-adversarial trade-off. We present CounterScene, a framework that endows closed-loop generative BEV world models with structured counterfactual reasoning for safety-critical scenario generation. Given a safe scene, CounterScene asks: what if the causally critical agent had behaved differently? To answer this, we introduce causal adversarial agent identification to find the critical agent and classify conflict types, and develop a conflict-aware interactive world model in which a causal interaction graph is used to explicitly model dynamic inter-agent dependencies. Building on this structure, stage-adaptive counterfactual guidance performs minimal interventions on the identified agent, removing its spatial and temporal safety margins while allowing risk to emerge through natural interaction propagation. Extensive experiments on nuScenes demonstrate that CounterScene achieves the strongest adversarial effectiveness while maintaining superior trajectory realism across all horizons, improving the long-horizon collision rate from 12.3% to 22.7% over the strongest baseline with better realism (ADE 1.88 vs. 2.09). Notably, this advantage further widens over longer rollouts, and CounterScene generalizes zero-shot to nuPlan with state-of-the-art realism.
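To make the notions of "spatial and temporal safety margins" and "critical agent" concrete, the sketch below uses a simple stand-in: the closest point of approach between the ego vehicle and each other agent under constant-velocity extrapolation. This is not CounterScene's causal interaction graph, whose details are not given in the abstract. The `Agent` class, both function names, and the sample scene are hypothetical. Here the distance at closest approach plays the role of a spatial margin and the time until it plays the role of a temporal margin.

```python
import math
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    x: float   # position in metres
    y: float
    vx: float  # velocity in metres per second
    vy: float


def closest_approach(ego, other):
    """Return (time, distance) of closest approach for t >= 0,
    assuming both agents keep their current velocity."""
    rx, ry = other.x - ego.x, other.y - ego.y
    vx, vy = other.vx - ego.vx, other.vy - ego.vy
    speed_sq = vx * vx + vy * vy
    # With zero relative velocity the gap never changes, so use t = 0.
    t = 0.0 if speed_sq == 0.0 else max(0.0, -(rx * vx + ry * vy) / speed_sq)
    return t, math.hypot(rx + vx * t, ry + vy * t)


def most_critical(ego, others):
    """Pick the agent with the smallest spatial margin (heuristic proxy,
    not the paper's causal identification)."""
    return min(others, key=lambda a: closest_approach(ego, a)[1])


# Made-up safe scene: ego drives along +x, one car crosses, one is parked.
ego = Agent("ego", 0.0, 0.0, 10.0, 0.0)
crossing = Agent("crossing", 50.0, -20.0, 0.0, 5.0)
parked = Agent("parked", 30.0, 10.0, 0.0, 0.0)

critical = most_critical(ego, [parked, crossing])
t_margin, d_margin = closest_approach(ego, critical)
```

In this sample scene the crossing car is selected, with a closest approach of roughly 4.5 m after 4.8 s. A counterfactual intervention in the spirit of the abstract would alter that agent's behavior so these margins shrink, while the other agents react through the world model rather than being scripted.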

Problem

Research questions and friction points this paper is trying to address.

counterfactual reasoning

safety-critical scenarios

generative world models

causal interaction

closed-loop evaluation

Innovation

Methods, ideas, or system contributions that make the work stand out.

counterfactual reasoning

causal interaction graph

generative world model