🤖 AI Summary
Learning-based traffic simulators that rely heavily on historical agent trajectories suffer from limited trajectory diversity and poor road compliance on out-of-distribution (OOD) maps. To address this, the paper proposes Path Diffuser, a two-stage diffusion model. In the first stage, candidate trajectories are generated in the Frenet coordinate system under explicit road constraints, using map geometry and motion primitives as a prior. The second stage models multi-agent interactions to jointly generate agent pose initializations and their trajectories, conditioned only on the map. The method takes static map inputs alone and does not depend on historical agent data. Experiments on Argoverse 2 report improvements over baseline methods of 1.92× on distribution metrics, 1.14× on common-sense metrics, and 1.62× on road compliance in adversarial benchmarks. These results indicate that the generated scenarios generalize better to OOD maps, which supports the development and testing of autonomous driving planners.
📝 Abstract
Simulating diverse and realistic traffic scenarios is critical for developing and testing autonomous planning. Traditional rule-based planners lack diversity and realism, while learning-based simulators often replay, forecast, or edit scenarios using historical agent trajectories. However, they struggle to generate new scenarios, limiting scalability and diversity due to their reliance on fully annotated logs and historical data. Thus, a key limitation of learning-based simulators is that they require agents' past trajectories and pose information in addition to map data, which might not be available for all agents on the road. Without this information, generated scenarios often contain unrealistic trajectories that deviate from drivable areas, particularly in out-of-distribution (OOD) map scenes (e.g., curved roads). To address this, we propose Path Diffuser (PD): a two-stage diffusion model for generating agent pose initializations and their corresponding trajectories conditioned on the map, free of any historical context of agents' trajectories. Furthermore, PD incorporates a motion primitive-based prior, leveraging Frenet frame candidate trajectories to enhance diversity while ensuring road-compliant trajectory generation. We also explore various design choices for modeling complex multi-agent interactions. We demonstrate the effectiveness of our method through extensive experiments on the Argoverse 2 dataset and additionally evaluate the generalizability of the approach on OOD map variants. Notably, Path Diffuser outperforms the baseline methods by 1.92× on distribution metrics, 1.14× on common-sense metrics, and 1.62× on road compliance in adversarial benchmarks.
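To make the idea of Frenet frame candidate trajectories concrete, the following is a minimal sketch of how motion-primitive candidates could be laid out along a lane centerline. It is not the paper's implementation: the function names, the constant-speed primitives, the smoothstep lateral profile, and the default horizon and time step are all assumptions chosen for illustration. In the Frenet frame, `s` is the arc length along the centerline and `d` is the signed lateral offset from it, so a candidate that keeps `d` small stays near the lane by construction.

```python
import numpy as np


def frenet_to_cartesian(centerline, s, d):
    """Map Frenet coordinates (s, d) to Cartesian points.

    centerline: (N, 2) polyline with distinct consecutive points (assumed).
    s: (T,) arc length along the polyline; d: (T,) signed lateral offset,
    positive to the left of the direction of travel.
    """
    seg = np.diff(centerline, axis=0)                   # (N-1, 2) segment vectors
    seg_len = np.linalg.norm(seg, axis=1)               # (N-1,) segment lengths
    cum = np.concatenate([[0.0], np.cumsum(seg_len)])   # arc length at each vertex
    s = np.clip(s, 0.0, cum[-1])                        # stay on the polyline
    # Index of the segment that contains each arc-length value.
    idx = np.clip(np.searchsorted(cum, s, side="right") - 1, 0, len(seg) - 1)
    tangent = seg[idx] / seg_len[idx, None]
    normal = np.stack([-tangent[:, 1], tangent[:, 0]], axis=1)  # left-hand normal
    base = centerline[idx] + tangent * (s - cum[idx])[:, None]
    return base + normal * d[:, None]


def candidate_trajectories(centerline, speeds, lateral_offsets,
                           horizon=6.0, dt=0.5):
    """Enumerate hypothetical motion primitives: one per (speed, final offset).

    Each primitive moves at constant speed along s and eases from d = 0 to the
    final lateral offset with a smoothstep profile (an illustrative choice).
    Returns an array of shape (len(speeds) * len(lateral_offsets), T, 2).
    """
    t = np.arange(1, int(round(horizon / dt)) + 1) * dt
    tau = t / horizon
    ease = 3 * tau**2 - 2 * tau**3                      # smoothstep from 0 to 1
    out = []
    for v in speeds:
        for d_end in lateral_offsets:
            out.append(frenet_to_cartesian(centerline, v * t, d_end * ease))
    return np.stack(out)
```

For a straight centerline along the x-axis, calling `candidate_trajectories(centerline, speeds=[5.0, 10.0], lateral_offsets=[-1.0, 0.0, 1.0])` yields six candidates that differ in how far they travel and where they end laterally. On a curved centerline the same primitives bend with the road, which is the property that makes a Frenet-frame prior attractive for road compliance.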