Revisiting Evaluation of Deep Neural Networks for Pedestrian Detection

📅 2025-11-13

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing pedestrian detection benchmarks rely on coarse-grained metrics computed over subsets of a validation dataset, which do not realistically reflect model performance in safety-critical applications such as automated driving. To address this, we propose a fine-grained error analysis that uses image segmentation to automatically distinguish between types of detection errors. First, we define eight error categories and propose new metrics for comparing performance along them. Second, we use these metrics to compare various backbones on a simplified version of the APD. On the CityPersons reasonable subset, without extra training data, this rather simple architecture achieves state-of-the-art results. The new metrics also give a more fine-grained and robust comparison between models, especially for safety-critical performance.

📝 Abstract

Reliable pedestrian detection is a crucial step towards automated driving systems. However, current performance benchmarks exhibit weaknesses: the metrics currently applied to various subsets of a validation dataset prevent a realistic performance evaluation of a DNN for pedestrian detection. Because image segmentation supplies fine-grained information about a street scene, it can serve as a starting point for automatically distinguishing between different types of errors during the evaluation of a pedestrian detector. In this work, we propose eight error categories for pedestrian detection, along with new metrics for comparing performance along these categories. We use the new metrics to compare various backbones for a simplified version of the APD, and show a more fine-grained and robust way to compare models, especially in terms of safety-critical performance. We achieve SOTA on CityPersons-reasonable (without extra training data) using a rather simple architecture.
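The idea of using segmentation to separate error types can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not name the eight categories, so the labels below (`true_positive`, `false_positive_on_<class>`, `missed_pedestrian`), the function names, the greedy IoU matching, and the 0.5 threshold are all assumptions made for illustration. Boxes are assumed to be integer pixel coordinates `(x1, y1, x2, y2)` inside the image, and the segmentation map is assumed to be a 2D grid of class names.

```python
from collections import Counter


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def dominant_label(seg, box):
    """Most frequent segmentation class inside the box (hypothetical helper)."""
    x1, y1, x2, y2 = box
    counts = Counter(seg[y][x] for y in range(y1, y2) for x in range(x1, x2))
    return counts.most_common(1)[0][0] if counts else "unknown"


def categorize(detections, ground_truth, seg, iou_thr=0.5):
    """Label each detection and each unmatched ground-truth box.

    Detections are assumed to be sorted by descending confidence and are
    greedily matched to the best still-unmatched ground-truth box.
    """
    matched = set()
    report = []
    for det in detections:
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            value = iou(det, gt)
            if value > best_iou:
                best_iou, best_j = value, j
        if best_j is not None and best_iou >= iou_thr:
            matched.add(best_j)
            report.append(("true_positive", det))
        else:
            # Attribute the false positive to what the segmentation shows there.
            report.append((f"false_positive_on_{dominant_label(seg, det)}", det))
    for j, gt in enumerate(ground_truth):
        if j not in matched:
            report.append(("missed_pedestrian", gt))
    return report
```

Counting the labels returned by `categorize` over a validation set would give one number per category, which is the kind of per-category comparison the abstract describes. A real evaluation would need the paper's actual category definitions and metrics.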

Problem

Research questions and friction points this paper is trying to address.

Current pedestrian detection metrics do not allow a realistic performance evaluation

Existing benchmarks fail to distinguish between critical error types effectively

Automated driving needs a more fine-grained comparison of safety-critical performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Using image segmentation to categorize detection errors

Proposing eight error categories and new evaluation metrics

Achieving state-of-the-art results on CityPersons-reasonable with a simplified architecture
