🤖 AI Summary
This paper addresses the challenge of identifying interactive scenarios for autonomous vehicle (AV) safety assessment. We propose “surprise potential,” a metric that quantifies the deviation the AV induces in the predicted trajectories of other traffic participants, enabling automatic identification of highly interactive scenarios in real-world driving logs. Methodologically, we use a reward model learned from human preferences as the reference for judging interactivity; combine trajectory prediction on nuScenes, counterfactual perturbation, and exhaustive enumeration of the design space to evaluate candidate metrics; and show that the best-performing surprise potential formulation achieves a correlation above 0.82 with the human-aligned reward function. The resulting metric outperforms existing approaches at identifying interactive scenarios, and the curated scenarios are used to validate motion planners as a downstream application.
📝 Abstract
Validating the safety and performance of an autonomous vehicle (AV) requires benchmarking on real-world driving logs. However, typical driving logs contain mostly uneventful scenarios with minimal interactions between road users. Identifying interactive scenarios in real-world driving logs enables the curation of datasets that amplify critical signals and provide a more accurate assessment of an AV's performance. In this paper, we present a novel metric that identifies interactive scenarios by measuring an AV's surprise potential on others. First, we identify three dimensions of the design space to describe a family of surprise potential measures. Second, we exhaustively evaluate and compare different instantiations of the surprise potential measure within this design space on the nuScenes dataset. To determine how well a surprise potential measure correctly identifies an interactive scenario, we use a reward model learned from human preferences to assess alignment with human intuition. Our proposed surprise potential, arising from this exhaustive comparative study, achieves a correlation of more than 0.82 with the human-aligned reward function, outperforming existing approaches. Lastly, we validate motion planners on curated interactive scenarios to demonstrate downstream applications.
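The evaluation idea in the abstract, scoring each scenario with a surprise measure and then checking how well those scores track a human-aligned reward, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the paper's formulation: the displacement-based `surprise_potential`, the helper names, the toy trajectories, and the use of Pearson correlation (the abstract does not say which correlation coefficient is reported or how the three design dimensions are instantiated).

```python
import math


def mean_displacement(traj_a, traj_b):
    """Average Euclidean distance between two equal-length (x, y) trajectories."""
    if len(traj_a) != len(traj_b) or not traj_a:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(a, b) for a, b in zip(traj_a, traj_b)) / len(traj_a)


def surprise_potential(expected_av_traj, candidate_av_trajs):
    """Hypothetical surprise measure (an assumption, not the paper's definition).

    Returns the largest mean displacement between what other agents are
    assumed to expect from the AV and any candidate AV trajectory.
    """
    return max(mean_displacement(expected_av_traj, c) for c in candidate_av_trajs)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


if __name__ == "__main__":
    # Toy scenarios: (expected AV trajectory, candidate AV trajectories).
    straight = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    scenarios = [
        (straight, [straight]),                                  # no deviation possible
        (straight, [[(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]]),      # mild swerve
        (straight, [[(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]]),      # strong cut-in
    ]
    scores = [surprise_potential(exp, cands) for exp, cands in scenarios]

    # Placeholder interactivity scores standing in for a learned reward model.
    reward_scores = [0.1, 0.4, 0.9]
    print("surprise scores:", scores)
    print("correlation with reward:", pearson(scores, reward_scores))
```

In this sketch a higher correlation would mean the candidate measure ranks scenarios more like the human-aligned reward does. That is the sense in which different instantiations of a measure could be compared, though the paper's own measures and evaluation protocol may differ.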