AI Summary
This work addresses the high inference latency of diffusion-based multi-agent trajectory prediction, which hinders real-time deployment in applications such as autonomous driving. To overcome this limitation, the authors propose ECTraj, a framework built on consistency models that uses a teacher-student training scheme. The teacher model incorporates partial ground-truth trajectories to provide stronger supervision, while the student model enables single-step, high-quality trajectory generation. During training, the approach supports top-K multi-hypothesis prediction to capture diverse future behaviors. Evaluated on the Argoverse 2 dataset, ECTraj achieves faster inference and improved prediction accuracy, establishing competitive new benchmarks for real-time trajectory forecasting.
Abstract
Diffusion models for multi-agent trajectory prediction are limited by iterative denoising, which causes inference latency that hinders their use in time-critical settings such as autonomous driving. Fast-sampling variants using DDIM and informed initial noise distributions partially alleviate this issue, but they either fail to achieve true single-step generation or are constrained by the chosen noise distribution. Consistency Models (CMs) offer high-quality one-step generation by mapping noise directly to data, but are difficult to train from scratch. We propose ECTraj, an enhanced CM pipeline with improved training and conditional generation for trajectory prediction. Our framework extends the student-teacher consistency training scheme: the student produces standard outputs, while the teacher explicitly fuses its predictions with parts of the ground truth to give stronger supervision. We also exploit CMs' direct denoising for top-K multi-shot generation during training. Combining conditional generation with this enhanced consistency objective yields faster inference and improved prediction accuracy, establishing competitive new benchmarks on the large-scale Argoverse 2 dataset.
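The three mechanisms named in the abstract (one-step generation, a teacher target fused with ground truth, and top-K selection during training) can be illustrated with a small sketch. This is not the authors' implementation: every function name below is hypothetical, and the per-timestep mask used for fusion, the squared-error distance, and the winner-takes-all reading of "top-K" are assumptions chosen only to make the ideas concrete.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def fuse_with_ground_truth(
    pred: Trajectory, truth: Trajectory, keep_truth: Sequence[bool]
) -> Trajectory:
    """Hypothetical teacher target: use the ground-truth point wherever
    keep_truth is True and the teacher's own prediction elsewhere."""
    return [t if keep else p for p, t, keep in zip(pred, truth, keep_truth)]


def mean_sq_error(a: Trajectory, b: Trajectory) -> float:
    """Mean squared Euclidean distance between two equal-length trajectories."""
    total = sum((ax - bx) ** 2 + (ay - by) ** 2 for (ax, ay), (bx, by) in zip(a, b))
    return total / len(a)


def best_of_k_loss(
    hypotheses: Sequence[Trajectory], target: Trajectory
) -> Tuple[int, float]:
    """Assumed winner-takes-all rule: return the index and loss of the
    hypothesis closest to the target, so only that one would be penalized."""
    losses = [mean_sq_error(h, target) for h in hypotheses]
    best = min(range(len(losses)), key=losses.__getitem__)
    return best, losses[best]


def one_step_samples(
    denoiser: Callable[[Trajectory, float], Trajectory],
    k: int,
    horizon: int,
    sigma_max: float,
    rng: random.Random,
) -> List[Trajectory]:
    """Draw K noise trajectories and map each to a trajectory with a single
    denoiser call, i.e. no iterative denoising loop."""
    samples = []
    for _ in range(k):
        noise = [
            (rng.gauss(0.0, sigma_max), rng.gauss(0.0, sigma_max))
            for _ in range(horizon)
        ]
        samples.append(denoiser(noise, sigma_max))
    return samples
```

In a training loop built along these lines, the student's K one-step samples would be scored against the fused teacher target with `best_of_k_loss`. How ECTraj actually selects ground-truth segments, weights the loss, and conditions on scene context is not stated in the abstract and is left out here.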