AI Summary
This work addresses multi-agent motion prediction for autonomous driving by jointly modeling marginal trajectory distributions of individual agents and joint trajectory distributions of interacting agents. To tackle the modeling challenges arising from scene constraints and complex agent interactions, the authors propose a multi-task learning framework with a retrocausal flow of information: a trajectory-level mechanism that propagates information from later points in marginal trajectories to earlier points in joint trajectories and enables goal-directed, scene-adaptive instruction following. Built on a Transformer architecture, the method models positional uncertainty with compressed exponential power distributions and generates joint distributions by re-encoding marginal trajectories and modeling interacting agents pairwise. The approach achieves state-of-the-art performance on Waymo Interaction Prediction and generalizes well to Argoverse 2. The implementation is publicly available.
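The two-stage generation of joint distributions (re-encode each agent's marginal trajectory, then model interacting agents pairwise) can be sketched in a few lines. The function names and data layout below are hypothetical illustrations, not the authors' API:

```python
from itertools import combinations

def joint_from_marginals(marginals, re_encode, pair_decode):
    """Hypothetical sketch of two-stage marginal-to-joint decoding.

    marginals:   dict mapping agent_id -> marginal trajectory
                 (a list of (x, y) points).
    re_encode:   maps a full marginal trajectory to a token; because the
                 token summarizes the *whole* trajectory, later marginal
                 points can influence earlier joint points (the
                 retrocausal flow of information described above).
    pair_decode: maps two tokens to a pair of joint trajectories.
    """
    # Re-encode every agent's complete marginal trajectory once.
    tokens = {agent: re_encode(traj) for agent, traj in marginals.items()}
    # Decode a joint prediction for each pair of interacting agents.
    joints = {}
    for a, b in combinations(sorted(marginals), 2):
        joints[(a, b)] = pair_decode(tokens[a], tokens[b])
    return joints
```

With trivial stand-ins for the two learned components (e.g. `re_encode` returning the final trajectory point), the function wires the pipeline end to end; in the real model both would be transformer modules.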
Abstract
Motion forecasts of road users (i.e., agents) vary in complexity as a function of scene constraints and interactive behavior. We address this with a multi-task learning method for motion forecasting that includes a retrocausal flow of information. The corresponding tasks are to forecast (1) marginal trajectory distributions for all modeled agents and (2) joint trajectory distributions for interacting agents. Using a transformer model, we generate the joint distributions by re-encoding marginal distributions, followed by pairwise modeling. This incorporates a retrocausal flow of information from later points in marginal trajectories to earlier points in joint trajectories. Per trajectory point, we model positional uncertainty using compressed exponential power distributions. Notably, our method achieves state-of-the-art results on the Waymo Interaction Prediction benchmark and generalizes well to the Argoverse 2 dataset. Additionally, our method provides an interface for issuing instructions through trajectory modifications. Our experiments show that regular training for motion forecasting leads to the ability to follow goal-based instructions and to adapt basic directional instructions to the scene context. Code: https://github.com/kit-mrt/future-motion
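The abstract does not spell out the parameterization of the per-point uncertainty model, so the following sketch assumes a standard univariate exponential power (generalized normal) density, p(x) = β / (2αΓ(1/β)) · exp(−(|x−μ|/α)^β), where shape values β > 2 give lighter-than-Gaussian ("compressed") tails. The function name and parameterization are assumptions for illustration, not the paper's implementation:

```python
import math

def exp_power_nll(x, mu, alpha, beta):
    """Negative log-likelihood of a univariate exponential power
    (generalized normal) distribution, an assumed stand-in for the
    compressed exponential power distribution used per trajectory point.

    pdf: beta / (2 * alpha * Gamma(1/beta)) * exp(-(|x - mu| / alpha)**beta)
    beta = 2 recovers a Gaussian; beta > 2 compresses the tails.
    """
    log_norm = math.log(beta) - math.log(2.0 * alpha) - math.lgamma(1.0 / beta)
    return (abs(x - mu) / alpha) ** beta - log_norm
```

A quick sanity check on the parameterization: with beta = 2 and alpha = sqrt(2)·sigma, this reduces to the familiar Gaussian negative log-likelihood.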