🤖 AI Summary
This work addresses motion prediction for connected and automated vehicles (CAVs) under V2X communication. To tackle real-world challenges, including transmission latency and the high computational cost of sharing large-scale feature representations, we propose the first unified framework integrating cooperative perception and cooperative motion prediction. Our method constructs a multi-agent graph structure from LiDAR inputs to jointly model inter-vehicle perception and prediction dependencies. It incorporates a latency-robust graph communication mechanism, a latency-aware feature alignment module, and a multi-vehicle weighted prediction fusion strategy. Evaluated on the OPV2V and V2V4Real benchmarks, our approach reduces the average prediction error by 12.3%, improving trajectory tracking and forecasting accuracy in complex interactive scenarios. The results demonstrate the effectiveness and practicality of joint cooperative modeling under realistic transmission latency.
📝 Abstract
The confluence of advances in Autonomous Vehicles (AVs) and the maturity of Vehicle-to-Everything (V2X) communication has enabled cooperative capabilities for connected and automated vehicles (CAVs). Building on cooperative perception, this paper explores the feasibility and effectiveness of cooperative motion prediction. Our method, CMP, takes LiDAR signals as input to enhance tracking and prediction capabilities. Unlike previous work that focuses on either cooperative perception or motion prediction in isolation, our framework is, to the best of our knowledge, the first to address the unified problem in which CAVs share information in both the perception and prediction modules. Our design has the unique capability to tolerate realistic V2X transmission delays while handling bulky perception representations. We also propose a prediction aggregation module, which unifies the predictions obtained by different CAVs and generates the final prediction. Through extensive experiments and ablation studies on the OPV2V and V2V4Real datasets, we demonstrate the effectiveness of our method in cooperative perception, tracking, and motion prediction. In particular, CMP reduces the average prediction error by 12.3% compared with the strongest baseline. Our work marks a significant step forward in the cooperative capabilities of CAVs, showcasing enhanced performance in complex scenarios. More details can be found on the project website: https://cmp-cooperative-prediction.github.io.
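The abstract does not specify how the prediction aggregation module combines per-CAV outputs. As a rough illustration of one common approach, the sketch below fuses trajectory predictions from multiple CAVs by confidence-weighted averaging of waypoints. The function name `fuse_predictions`, the `(x, y)` waypoint format, and the weighting scheme are illustrative assumptions, not the paper's actual design.

```python
# Hypothetical sketch of multi-vehicle prediction fusion: each CAV produces
# a predicted trajectory (a list of (x, y) waypoints, one per future
# timestep) for the same target agent, plus a confidence score. The fused
# trajectory is the confidence-weighted average of the waypoints.

def fuse_predictions(predictions, confidences):
    """Fuse per-CAV trajectories into one by confidence-weighted averaging.

    predictions: list of trajectories, each a list of (x, y) tuples of
                 equal length (one entry per future timestep).
    confidences: one non-negative score per CAV.
    """
    total = sum(confidences)
    if total == 0:
        raise ValueError("at least one CAV must have positive confidence")
    weights = [c / total for c in confidences]
    horizon = len(predictions[0])
    fused = []
    for t in range(horizon):
        x = sum(w * traj[t][0] for w, traj in zip(weights, predictions))
        y = sum(w * traj[t][1] for w, traj in zip(weights, predictions))
        fused.append((x, y))
    return fused

# Two CAVs predict slightly different futures for the same vehicle;
# the first CAV is trusted three times as much as the second.
cav_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
cav_b = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
fused = fuse_predictions([cav_a, cav_b], confidences=[3.0, 1.0])
```

In a real latency-aware system the weights would likely also account for each message's staleness and the sender's distance to the target, not just a raw confidence score.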