🤖 AI Summary
To address poor interpretability and high computational cost in interaction modeling for autonomous driving trajectory prediction, this paper proposes a physics-informed, lightweight interaction modeling framework. It replaces the black-box Transformer attention scores with an explicit, physically meaningful dynamic correlation coefficient, and introduces an interpretable agent selection module so that only critical interacting agents are modeled. The method also adopts a lightweight map encoder and is trained jointly on multi-source datasets (INTERACTION, highD, and CitySim). It reports significant improvements over state-of-the-art approaches across the three benchmarks: average displacement error (ADE) and final displacement error (FDE) are reduced by 12.3% and 9.8%, respectively; inference speed increases by 2.1×; and model parameters and FLOPs decrease by 67% and 74%. The approach thus jointly advances prediction accuracy, computational efficiency, and model interpretability.
📝 Abstract
A thorough understanding of the interaction between the target agent and surrounding agents is a prerequisite for accurate trajectory prediction. Although many methods have been explored, they assign correlation coefficients to surrounding agents in a purely learning-based manner. In this study, we present ASPILin, which manually selects interacting agents and replaces the attention scores in the Transformer with a newly computed physical correlation coefficient, enhancing the interpretability of interaction modeling. Surprisingly, these simple modifications significantly improve prediction performance and substantially reduce computational costs. We intentionally simplify our model in other aspects, such as map encoding. Experiments conducted on the INTERACTION, highD, and CitySim datasets demonstrate that our method is efficient and straightforward, outperforming other state-of-the-art methods.
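To make the core idea concrete, the following is a minimal sketch of using a physically computed coefficient in place of learned attention weights. It is not ASPILin's actual formulation: the abstract does not define the coefficient, so this sketch assumes a hypothetical one based on the closest distance two agents would reach under constant velocity. The function names, the `horizon` and `k` parameters, and the sum normalization are all illustrative assumptions.

```python
import numpy as np


def physical_coefficients(target_pos, target_vel, nbr_pos, nbr_vel,
                          horizon=3.0, eps=1e-6):
    """Hypothetical physical coefficient (an assumption, not the paper's).

    For each neighbor, extrapolate both agents at constant velocity, find the
    minimum distance reached within `horizon` seconds, and map it to (0, 1].
    """
    rel_pos = nbr_pos - target_pos                      # (N, 2)
    rel_vel = nbr_vel - target_vel                      # (N, 2)
    speed_sq = np.sum(rel_vel ** 2, axis=1)             # (N,)
    # Time of closest approach, restricted to the window [0, horizon].
    t_star = -np.sum(rel_pos * rel_vel, axis=1) / (speed_sq + eps)
    t_star = np.clip(t_star, 0.0, horizon)
    closest = rel_pos + rel_vel * t_star[:, None]       # (N, 2)
    d_min = np.linalg.norm(closest, axis=1)             # (N,)
    return 1.0 / (1.0 + d_min)


def aggregate(values, coeffs, k=2):
    """Select the k agents with the largest coefficients and combine their
    feature vectors, using normalized coefficients where a Transformer would
    use softmax(QK^T) attention weights."""
    keep = np.argsort(coeffs)[::-1][:k]                 # indices, largest first
    weights = coeffs[keep] / coeffs[keep].sum()         # weights sum to 1
    context = weights @ values[keep]                    # (D,)
    return keep, weights, context


# Toy scene: target at the origin driving along +x at 10 m/s.
target_pos = np.array([0.0, 0.0])
target_vel = np.array([10.0, 0.0])
nbr_pos = np.array([[20.0, 0.0],    # slower vehicle ahead in the same lane
                    [0.0, 50.0],    # distant vehicle with the same velocity
                    [-10.0, 3.5]])  # faster vehicle approaching in the next lane
nbr_vel = np.array([[5.0, 0.0], [10.0, 0.0], [15.0, 0.0]])
values = np.eye(3)                  # stand-in for encoded neighbor features

coeffs = physical_coefficients(target_pos, target_vel, nbr_pos, nbr_vel)
keep, weights, context = aggregate(values, coeffs, k=2)
```

In this toy scene the distant vehicle is dropped by the selection step, and the two remaining weights can be read directly as "how physically relevant is this neighbor", which is the interpretability argument the abstract makes. Because the weights require no query-key products, this style of aggregation is also cheaper than learned attention, in line with the claimed reduction in computational cost.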