EP-Diffuser: An Efficient Diffusion Model for Traffic Scene Generation and Prediction via Polynomial Representations

📅 2025-04-07
🤖 AI Summary
Long-horizon forecasting of traffic scene evolution is hard because agent motion is multimodal and models often cover the underlying distribution poorly; existing methods typically optimize for a single most-likely trajectory and so fail to provide the safety redundancy that autonomous driving requires. This paper proposes a lightweight diffusion model built on polynomial representations and conditioned on road topology and agent history, which generates diverse, physically plausible distributions over future scenes. Integrating polynomial representations into the diffusion process yields significant model compression (>40% smaller than SotA) without sacrificing prediction accuracy, mode coverage, or cross-dataset generalization. The method achieves state-of-the-art performance on Argoverse 2 and remains robust under out-of-distribution (OoD) conditions: on the Waymo OoD test set, it reduces final displacement error (FDE) by 12.3% compared to prior approaches.
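The final displacement error (FDE) mentioned in the summary is the Euclidean distance between the predicted and ground-truth positions at the last time step. The sketch below is illustrative only: it assumes trajectories stored as `(T, 2)` NumPy arrays and, for a set of sampled futures, takes the minimum over samples (a common variant; the summary does not say which variant the paper reports). The function names are hypothetical and not taken from the EP-Diffuser repository.

```python
import numpy as np

def final_displacement_error(pred, gt):
    """Euclidean distance between the last predicted and last true position.

    pred, gt: arrays of shape (T, 2) holding x/y positions over T time steps.
    """
    return float(np.linalg.norm(pred[-1] - gt[-1]))

def min_fde(samples, gt):
    """Smallest FDE over K sampled futures; samples has shape (K, T, 2)."""
    dists = np.linalg.norm(samples[:, -1] - gt[-1], axis=-1)
    return float(dists.min())

# Toy example: three sampled futures, each two steps long (assumed data).
gt = np.array([[0.0, 0.0], [10.0, 0.0]])
samples = np.array([
    [[0.0, 0.0], [10.0, 3.0]],   # ends 3 m from the true endpoint
    [[0.0, 0.0], [13.0, 4.0]],   # ends 5 m from the true endpoint
    [[0.0, 0.0], [10.0, 1.0]],   # ends 1 m from the true endpoint
])
best = min_fde(samples, gt)
```

A relative reduction such as the one quoted above would then be computed as `(baseline_fde - model_fde) / baseline_fde`.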

📝 Abstract
As the prediction horizon increases, predicting the future evolution of traffic scenes becomes increasingly difficult due to the multi-modal nature of agent motion. Most state-of-the-art (SotA) prediction models primarily focus on forecasting the most likely future. However, for the safe operation of autonomous vehicles, it is equally important to cover the distribution of plausible motion alternatives. To address this, we introduce EP-Diffuser, a novel parameter-efficient diffusion-based generative model designed to capture the distribution of possible traffic scene evolutions. Conditioned on road layout and agent history, our model acts as a predictor and generates diverse, plausible scene continuations. We benchmark EP-Diffuser against two SotA models in terms of accuracy and plausibility of predictions on the Argoverse 2 dataset. Despite its significantly smaller model size, our approach achieves both highly accurate and plausible traffic scene predictions. We further evaluate the model's generalization ability in an out-of-distribution (OoD) test setting using the Waymo Open dataset and show superior robustness of our approach. The code and model checkpoints can be found here: https://github.com/continental/EP-Diffuser.

Problem

Research questions and friction points this paper is trying to address.

Predicting multi-modal traffic scene evolution accurately
Generating diverse plausible motion alternatives for safety
Efficient diffusion model for robust scene prediction

Innovation

Methods, ideas, or system contributions that make the work stand out.

Efficient diffusion model for traffic prediction
Polynomial representations enhance scene generation
Small model size with high accuracy
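
To make the idea of a polynomial trajectory representation concrete, the sketch below compresses a sampled trajectory into a small set of polynomial coefficients and reconstructs it. It is a minimal illustration under stated assumptions: a monomial basis, degree 5, and a normalized time axis on [0, 1] are all choices made here for the example; the page does not state which basis or degree EP-Diffuser uses, and the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import polynomial as P

def fit_trajectory(xy, degree=5):
    """Least-squares fit of polynomial coefficients to a (T, 2) trajectory.

    Returns an array of shape (degree + 1, 2): one coefficient column
    for x(t) and one for y(t), with t normalized to [0, 1].
    """
    t = np.linspace(0.0, 1.0, len(xy))
    basis = P.polyvander(t, degree)              # (T, degree + 1)
    coeffs, *_ = np.linalg.lstsq(basis, xy, rcond=None)
    return coeffs

def eval_trajectory(coeffs, num_steps):
    """Evaluate the fitted polynomials at num_steps evenly spaced times."""
    t = np.linspace(0.0, 1.0, num_steps)
    basis = P.polyvander(t, coeffs.shape[0] - 1)
    return basis @ coeffs                        # (num_steps, 2)

# Toy trajectory (assumed data): 60 samples of a gentle lateral drift.
t = np.linspace(0.0, 1.0, 60)
xy = np.stack([20.0 * t, 3.0 * t**2 - t**3], axis=1)

coeffs = fit_trajectory(xy, degree=5)   # 12 numbers instead of 120
recon = eval_trajectory(coeffs, 60)
```

Here 60 two-dimensional samples are summarized by 12 coefficients. A generative model that operates on such coefficients rather than on raw waypoints has a much smaller output space, which is one plausible reading of how polynomial representations help keep the model small.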