Post-interactive Multimodal Trajectory Prediction for Autonomous Driving

📅 2025-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing multi-agent trajectory prediction methods for autonomous driving rarely model *post-interactions*, i.e., the interactions among the agents' *predicted* trajectories. Method: The paper proposes Pioformer, a coarse-to-fine Transformer that explicitly extracts post-interaction features. A Coarse Trajectory Network uses graph neural networks to capture low-order interactions, a Trajectory Proposal Network uses hypergraph neural networks to capture high-order interactions, and a Proposal Refinement Network combines these interaction features with trajectory consistency features. The model is trained with a three-stage scheme. Results: On Argoverse 1, Pioformer reduces prediction errors relative to the HiVT-64 baseline: minADE₆ ↓4.4%, minFDE₆ ↓8.4%, MR₆ ↓14.4%, and brier-minFDE₆ ↓5.7%.

📝 Abstract

Modeling the interactions among agents for trajectory prediction in autonomous driving has been challenging due to the inherent uncertainty in agents' behavior. The interactions involved in the predicted trajectories of agents, also called post-interactions, have rarely been considered in trajectory prediction models. To this end, we propose a coarse-to-fine Transformer for multimodal trajectory prediction, i.e., Pioformer, which explicitly extracts the post-interaction features to enhance the prediction accuracy. Specifically, we first build a Coarse Trajectory Network to generate coarse trajectories based on the observed trajectories and lane segments, in which the low-order interaction features are extracted with graph neural networks. Next, we build a hypergraph neural network-based Trajectory Proposal Network to generate trajectory proposals, where the high-order interaction features are learned by the hypergraphs. Finally, the trajectory proposals are sent to the Proposal Refinement Network for further refinement. The observed trajectories and trajectory proposals are concatenated as the inputs of the Proposal Refinement Network, in which the post-interaction features are learned by combining the previous interaction features and trajectory consistency features. Moreover, we propose a three-stage training scheme to facilitate the learning process. Extensive experiments on the Argoverse 1 dataset demonstrate the superiority of our method. Compared with the baseline HiVT-64, our model reduces the prediction errors by 4.4%, 8.4%, 14.4%, and 5.7% on the minADE₆, minFDE₆, MR₆, and brier-minFDE₆ metrics, respectively.
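For readers unfamiliar with the four reported metrics, the sketch below shows how they are commonly computed for K-mode forecasts. It is not taken from the paper: the function name `forecasting_metrics`, the array layout, the 2.0 m miss threshold, and the choice to pick the best mode by endpoint error are all assumptions based on the usual Argoverse 1 conventions.

```python
import numpy as np

def forecasting_metrics(pred, prob, gt, miss_threshold=2.0):
    """Common multimodal forecasting metrics (assumed Argoverse-1-style definitions).

    pred: (N, K, T, 2) predicted trajectories, K modes per agent
    prob: (N, K) mode probabilities
    gt:   (N, T, 2) ground-truth future trajectories
    """
    # Endpoint (final-step) error of every mode, shape (N, K)
    fde = np.linalg.norm(pred[:, :, -1] - gt[:, None, -1], axis=-1)
    best = fde.argmin(axis=1)            # best mode = lowest endpoint error
    idx = np.arange(pred.shape[0])
    min_fde = fde[idx, best]
    # Average displacement error of the best mode over all time steps
    ade_best = np.linalg.norm(pred[idx, best] - gt, axis=-1).mean(axis=-1)
    # Brier variant: penalize low confidence in the best mode
    brier = min_fde + (1.0 - prob[idx, best]) ** 2
    return {
        "minADE": float(ade_best.mean()),
        "minFDE": float(min_fde.mean()),
        "MR": float((min_fde > miss_threshold).mean()),  # miss rate
        "brier-minFDE": float(brier.mean()),
    }
```

With K = 6 these correspond to minADE₆, minFDE₆, MR₆, and brier-minFDE₆. Lower is better for all four, which is why the abstract reports its gains as percentage reductions.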

Problem

Research questions and friction points this paper is trying to address.

Modeling agent interactions for autonomous driving trajectory prediction.

Incorporating post-interaction features to enhance prediction accuracy.

Refining multimodal trajectory predictions from coarse to fine.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Coarse-to-fine Transformer for trajectory prediction

Hypergraph neural network for high-order interactions

Three-stage training scheme for enhanced learning
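To make "high-order interactions" concrete: a hypergraph lets a single hyperedge connect any number of agents, whereas an ordinary graph edge connects exactly two. The sketch below is a generic two-step mean aggregation over a hypergraph, not Pioformer's actual layer. The function name `hypergraph_mean_pass` and the incidence-matrix layout are illustrative assumptions, and no learned weights are included.

```python
import numpy as np

def hypergraph_mean_pass(x, incidence):
    """One round of agent -> hyperedge -> agent mean aggregation (illustrative only).

    x:         (N, D) per-agent feature vectors
    incidence: (N, E) matrix, 1 if agent n belongs to hyperedge e, else 0
    """
    h = incidence.astype(float)
    edge_deg = np.maximum(h.sum(axis=0), 1.0)   # agents per hyperedge, (E,)
    node_deg = np.maximum(h.sum(axis=1), 1.0)   # hyperedges per agent, (N,)
    # Step 1: each hyperedge averages the features of its member agents
    edge_feat = (h.T @ x) / edge_deg[:, None]   # (E, D)
    # Step 2: each agent averages the features of the hyperedges it belongs to
    return (h @ edge_feat) / node_deg[:, None]  # (N, D)

# Four agents; hyperedge 0 groups agents {0, 1, 2}, hyperedge 1 groups {2, 3}
x = np.array([[1.0], [2.0], [3.0], [10.0]])
incidence = np.array([[1, 0], [1, 0], [1, 1], [0, 1]])
out = hypergraph_mean_pass(x, incidence)
```

Agent 2 sits in both groups, so its updated feature blends information from all four agents in one pass. Capturing that kind of group-level dependency is what motivates using hyperedges instead of pairwise edges.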

🔎 Similar Papers

EqDrive: Efficient Equivariant Motion Forecasting with Multi-Modality for Autonomous Driving