🤖 AI Summary
Pedestrian trajectory prediction in urban traffic scenarios suffers from insufficient traffic context modeling and a fundamental trade-off between real-time performance and prediction reliability.
Method: We propose Snapshot, a lightweight, modular, agent-centric feed-forward neural network. We introduce the first traffic-aware pedestrian forecasting benchmark on Argoverse 2, explicitly incorporating traffic signals, lane topology, and interactive agents. Instead of sequential modeling, Snapshot performs feed-forward inference solely from a short historical observation window. We further design a traffic-aware feature encoding scheme and a dedicated data adaptation pipeline.
Contribution/Results: Snapshot achieves an 8.8% improvement in Average Displacement Error (ADE) over state-of-the-art methods. Its inference latency satisfies stringent onboard real-time deployment requirements (<50 ms per agent). The model has been fully integrated end-to-end into an autonomous driving software stack and validated on real vehicles.
📝 Abstract
This paper explores pedestrian trajectory prediction in urban traffic, focusing on both model accuracy and real-world applicability. While promising approaches exist, they often revolve around pedestrian datasets that exclude traffic-related information, or use architectures that are either not real-time capable or not robust. To address these limitations, we first introduce a dedicated benchmark based on Argoverse 2, specifically targeting pedestrians in traffic environments. Following this, we present Snapshot, a modular, feed-forward neural network that outperforms the current state of the art, reducing the Average Displacement Error (ADE) by 8.8% while utilizing significantly less information. Despite its agent-centric encoding scheme, Snapshot demonstrates scalability, real-time performance, and robustness to varying motion histories. Moreover, by integrating Snapshot into a modular autonomous driving software stack, we showcase its real-world applicability.
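For readers unfamiliar with the headline metric, the following is a minimal sketch of how Average Displacement Error is commonly computed: the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon. It is not taken from the paper's code. The function names, the toy trajectories, and the two error values used for the relative-reduction example are all hypothetical placeholders chosen for illustration, not reported results.

```python
from math import hypot


def average_displacement_error(predicted, ground_truth):
    """Mean Euclidean distance between matched 2D positions of two trajectories."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [
        hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(predicted, ground_truth)
    ]
    return sum(distances) / len(distances)


def relative_reduction(baseline_error, new_error):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline_error - new_error) / baseline_error


# Toy trajectories (hypothetical): three future positions in metres.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ground_truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade = average_displacement_error(predicted, ground_truth)

# Placeholder error values (NOT from the paper), chosen only to show
# how a percentage such as "8.8%" is derived from two ADE values.
reduction = relative_reduction(baseline_error=0.500, new_error=0.456)
print(f"ADE = {ade:.3f} m, relative reduction = {reduction:.1%}")
```

Under this reading, an 8.8% ADE reduction is a relative change against the baseline's ADE, not an absolute difference in metres.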