🤖 AI Summary
This work addresses the lack of structured, semantically interpretable scene descriptions in autonomous driving trajectory data. We propose the Trajectory-to-Action Pipeline (TAP), an end-to-end framework that automatically generates Scenario Description Language (SDL) labels from raw trajectories, enabling behavior comparison, similarity retrieval, and edge-case detection. Our method uses a rules-based cross-entropy optimization scheme that learns SDL parameters directly from data, integrating domain knowledge with data-driven learning to improve generalization across driving contexts. The framework incorporates trajectory representation learning and a Waymo Open Motion Dataset (WOMD)-adapted module, eliminating the need for manual annotation. Evaluated on WOMD, TAP achieves 30% greater precision than Average Displacement Error (ADE) and 24% greater precision than Dynamic Time Warping (DTW) in identifying behaviorally similar trajectories. To our knowledge, this is the first fully automated method for semantic, interpretable identification of distinct driving behaviors, which can significantly accelerate safety validation of autonomous driving systems.
📝 Abstract
Scenario Description Languages (SDLs) provide structured, interpretable embeddings that represent traffic scenarios encountered by autonomous vehicles (AVs), supporting key tasks such as scenario similarity search and edge-case detection for safety analysis. This paper introduces the Trajectory-to-Action Pipeline (TAP), a scalable and automated method for extracting SDL labels from large trajectory datasets. TAP applies a rules-based cross-entropy optimization approach to learn parameters directly from data, enhancing generalization across diverse driving contexts. On the Waymo Open Motion Dataset (WOMD), TAP achieves 30% greater precision than Average Displacement Error (ADE) and 24% greater precision than Dynamic Time Warping (DTW) in identifying behaviorally similar trajectories. Additionally, TAP enables automated detection of unique driving behaviors, streamlining safety evaluation for AV testing. This work provides a foundation for scalable scenario-based AV behavior analysis, with potential extensions to multi-agent contexts.
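For readers unfamiliar with the two baselines named above, the following minimal sketch shows the textbook definitions of ADE and DTW as trajectory distances. It is illustrative only and is not the paper's evaluation code: the function names `ade` and `dtw`, the 2D `(x, y)` trajectory format, and the toy trajectories are all assumptions made for this example.

```python
import numpy as np


def ade(a, b):
    """Average Displacement Error: mean pointwise Euclidean distance.

    Assumes both trajectories have the same number of time steps.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("ADE requires trajectories of equal shape")
    return float(np.linalg.norm(a - b, axis=1).mean())


def dtw(a, b):
    """Dynamic Time Warping cost with Euclidean point distance.

    Classic O(n*m) dynamic program; trajectories may differ in length.
    """
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


# Hypothetical toy trajectories: a straight path and the same path offset by 1 m.
straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
shifted = [(0, 1), (1, 1), (2, 1), (3, 1)]

print(ade(straight, shifted))  # 1.0
print(dtw(straight, shifted))  # 4.0
```

Both measures compare raw geometry only, so two trajectories that express the same maneuver at different positions or speeds can score as dissimilar. That limitation is the motivation for comparing trajectories through SDL labels instead.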