Beyond ADE and FDE: A Comprehensive Evaluation Framework for Safety-Critical Prediction in Multi-Agent Autonomous Driving Scenarios

📅 2025-10-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing autonomous driving prediction evaluation relies excessively on scalar metrics such as ADE/FDE, failing to capture safety-critical behavioral discrepancies under multi-agent interaction and lacking systematic robustness testing across scene topology, map context, and agent spatial distribution. Method: We propose the first scenario-aware, comprehensive evaluation framework. It employs controlled-variable experiments to quantify proximity effects, revealing how close-range interactions critically degrade prediction accuracy, and leverages multidimensional real-world driving data to uncover failure modes masked by conventional metrics. Contribution/Results: Experiments demonstrate significant vulnerabilities of state-of-the-art models under high-density traffic and specific spatial configurations. Our framework fills a critical methodological gap in evaluating prediction models under complex interactive scenarios and provides a reproducible benchmarking toolkit to enhance model safety and robustness.

📝 Abstract
Current evaluation methods for autonomous driving prediction models rely heavily on simplistic metrics such as Average Displacement Error (ADE) and Final Displacement Error (FDE). While these metrics offer basic performance assessments, they fail to capture the nuanced behavior of prediction modules under complex, interactive, and safety-critical driving scenarios. For instance, existing benchmarks do not distinguish the influence of nearby versus distant agents, nor systematically test model robustness across varying multi-agent interactions. This paper addresses this critical gap by proposing a novel testing framework that evaluates prediction performance under diverse scene structures, map contexts, agent densities, and spatial distributions. Through extensive empirical analysis, we quantify the differential impact of agent proximity on target trajectory prediction and identify scenario-specific failure cases that are not exposed by traditional metrics. Our findings highlight key vulnerabilities in current state-of-the-art prediction models and demonstrate the importance of scenario-aware evaluation. The proposed framework lays the groundwork for rigorous, safety-driven prediction validation, contributing significantly to the identification of failure-prone corner cases and the development of robust, certifiable prediction systems for autonomous vehicles.
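For context, ADE and FDE have standard definitions: ADE averages the Euclidean distance between predicted and ground-truth waypoints over the prediction horizon, while FDE measures only the endpoint error. A minimal NumPy sketch (illustrative, not code from the paper):

```python
import numpy as np

def ade_fde(pred, gt):
    """Average and Final Displacement Error for one trajectory.

    pred, gt: arrays of shape (T, 2) holding (x, y) waypoints.
    """
    # Per-timestep Euclidean distance between prediction and ground truth.
    dists = np.linalg.norm(pred - gt, axis=-1)
    ade = dists.mean()   # mean error over the whole horizon
    fde = dists[-1]      # error at the final timestep only
    return ade, fde

# Toy example: a prediction offset from the ground truth by a constant 1 m in x.
gt = np.stack([np.arange(5, dtype=float), np.zeros(5)], axis=-1)
pred = gt + np.array([1.0, 0.0])
ade, fde = ade_fde(pred, gt)
```

Because both numbers collapse a full trajectory into a single scalar, two predictions with identical ADE/FDE can differ sharply in safety-relevant ways (e.g. one cutting through an occupied lane), which is the gap the abstract describes.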
Problem

Research questions and friction points this paper is trying to address.

Evaluates prediction models beyond basic ADE/FDE metrics in autonomous driving
Assesses model robustness across varying multi-agent interactions and densities
Identifies safety-critical failure cases in complex interactive driving scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes novel testing framework for prediction evaluation
Quantifies agent proximity impact on trajectory prediction
Identifies scenario-specific failure cases beyond traditional metrics
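One way to realize the proximity analysis the paper describes is to stratify per-sample prediction error by distance to the nearest surrounding agent. The sketch below is an assumption about how such a controlled-variable breakdown could be implemented, not the authors' code; the bin edges are arbitrary illustrative values:

```python
import numpy as np

def error_by_proximity(errors, nearest_dists,
                       bins=(0.0, 5.0, 15.0, 30.0, np.inf)):
    """Bucket per-sample prediction errors by distance to the nearest agent.

    errors:        (N,) per-sample scalar error, e.g. ADE.
    nearest_dists: (N,) distance from each target agent to its closest neighbor.
    Returns the mean error per proximity bin (NaN for empty bins).
    """
    bins = np.asarray(bins)
    idx = np.digitize(nearest_dists, bins) - 1  # bin index per sample
    means = []
    for b in range(len(bins) - 1):
        mask = idx == b
        means.append(errors[mask].mean() if mask.any() else np.nan)
    return np.array(means)

# Toy data: errors tend to be larger when another agent is close by.
errors = np.array([2.0, 1.8, 0.9, 0.5])
dists = np.array([2.0, 4.0, 10.0, 40.0])
binned = error_by_proximity(errors, dists)
print(binned)  # mean error per distance bin; the 15-30 m bin is empty here
```

Comparing the per-bin means against the aggregate ADE exposes exactly the kind of close-range degradation that a single dataset-wide average masks.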
Feifei Liu
South China Normal University, Shanwei, Guangdong, China
Haozhe Wang
South China Normal University, Shanwei, Guangdong, China
Zejun Wei
South China Normal University, Shanwei, Guangdong, China
Qirong Lu
South China Normal University, Shanwei, Guangdong, China
Yiyang Wen
Shenzhen Technology University, Shenzhen, Guangdong, China
Xiaoyu Tang
South China Normal University, Shanwei, Guangdong, China
Jingyan Jiang
Shenzhen Technology University
Test-time adaptation, Embodied AI, Machine learning systems
Zhijian He
Shenzhen Technology University, Shenzhen, Guangdong, China