🤖 AI Summary
Existing automated vehicle (AV) trajectory datasets fall short in refinement, reliability, and completeness, which hinders rigorous modeling and evaluation of microscopic longitudinal behavior. To address this, we introduce Ultra-AV, a unified longitudinal trajectory dataset for AVs that aggregates 14 real-world data sources spanning different AV types, test sites, and experiment scenarios. The data are standardized through a three-step pipeline: (i) longitudinal trajectory extraction, (ii) general data cleaning, and (iii) data-specific cleaning. Validity is checked through performance evaluations covering safety, mobility, stability, and sustainability, together with an analysis of the relationships between variables in car-following models. The resulting dataset gives researchers standardized data and metrics for car-following modeling and longitudinal behavior studies, and offers guidelines for data collection and model development.
📝 Abstract
Automated vehicles (AVs) promise significant advances in transportation. Realizing these advances requires understanding AVs’ longitudinal behavior, which relies heavily on real-world trajectory data. Existing open-source AV trajectory datasets, however, often fall short in refinement, reliability, and completeness, which hinders the analysis of performance metrics and the development of models. This study addresses these challenges by creating a Unified longitudinal trajectory dataset for AVs (Ultra-AV) for analyzing their microscopic longitudinal driving behaviors. The dataset compiles data from 14 distinct sources, encompassing various AV types, test sites, and experiment scenarios. We establish a three-step data processing procedure: (1) extraction of longitudinal trajectory data, (2) general data cleaning, and (3) data-specific cleaning, which together yield the longitudinal trajectory data and the car-following trajectory data. The validity of the processed data is confirmed through performance evaluations across safety, mobility, stability, and sustainability, along with an analysis of the relationships between variables in car-following models. Our work not only provides researchers with standardized data and metrics for longitudinal AV behavior studies but also sets guidelines for data collection and model development.
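To make the three-step processing concrete, the following is a minimal Python sketch of how such a pipeline might be organized. It is not the authors' implementation: the field names (`x_lead`, `x_follow`, `v_lead`, `v_follow`), the function names, the plausibility rules, and the 120 m gap cap are all hypothetical choices made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Record:
    t: float          # timestamp, s
    spacing: float    # bumper-to-bumper gap to the lead vehicle, m
    v_follow: float   # speed of the following AV, m/s
    rel_speed: float  # lead speed minus follower speed, m/s


def extract(raw: List[dict], lead_length: float = 0.0) -> List[Record]:
    """Step 1 (assumed form): derive longitudinal variables from raw samples."""
    return [
        Record(
            t=r["t"],
            spacing=r["x_lead"] - r["x_follow"] - lead_length,
            v_follow=r["v_follow"],
            rel_speed=r["v_lead"] - r["v_follow"],
        )
        for r in raw
    ]


def general_clean(records: List[Record]) -> List[Record]:
    """Step 2 (assumed rules): drop non-finite or physically implausible samples."""
    def ok(r: Record) -> bool:
        values = (r.t, r.spacing, r.v_follow, r.rel_speed)
        return all(math.isfinite(v) for v in values) and r.spacing > 0 and r.v_follow >= 0
    return [r for r in records if ok(r)]


def specific_clean(records: List[Record], max_gap: float = 120.0) -> List[Record]:
    """Step 3 (hypothetical rule): keep samples with a gap small enough
    to plausibly reflect car following."""
    return [r for r in records if r.spacing <= max_gap]


def time_headway(r: Record) -> Optional[float]:
    """Spacing divided by follower speed; undefined when the follower is stopped."""
    return r.spacing / r.v_follow if r.v_follow > 0 else None


# Illustrative input: one valid sample, one with a missing position,
# and one whose gap is too large to count as car following.
raw = [
    {"t": 0.0, "x_lead": 50.0, "v_lead": 15.0, "x_follow": 20.0, "v_follow": 14.0},
    {"t": 0.1, "x_lead": 51.5, "v_lead": 15.0, "x_follow": float("nan"), "v_follow": 14.0},
    {"t": 0.2, "x_lead": 400.0, "v_lead": 15.0, "x_follow": 22.8, "v_follow": 14.0},
]
cf = specific_clean(general_clean(extract(raw, lead_length=5.0)))
```

Keeping each step as a separate function mirrors the staged structure described in the abstract. A real pipeline would also need source-specific rules, which the abstract does not detail.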