Revisiting Evaluation of Deep Neural Networks for Pedestrian Detection

📅 2025-11-13
🤖 AI Summary
Existing pedestrian detection benchmarks rely on coarse-grained metrics computed over subsets of a validation dataset, which do not give a realistic picture of model performance in safety-critical applications such as automated driving. To address this, the paper proposes a fine-grained error analysis that uses image segmentation to automatically distinguish between types of detection errors. First, it defines eight error categories and introduces new metrics for comparing models along them. Second, it applies these metrics to compare various backbones for a simplified version of the APD. Using a rather simple architecture and no extra training data, the approach achieves state-of-the-art results on the CityPersons reasonable subset. The new metrics also allow a more fine-grained and robust comparison between models, especially with respect to safety-critical performance.

📝 Abstract
Reliable pedestrian detection represents a crucial step towards automated driving systems. However, the current performance benchmarks exhibit weaknesses. The currently applied metrics for various subsets of a validation dataset prohibit a realistic performance evaluation of a DNN for pedestrian detection. As image segmentation supplies fine-grained information about a street scene, it can serve as a starting point to automatically distinguish between different types of errors during the evaluation of a pedestrian detector. In this work, we propose eight error categories for pedestrian detection, together with new metrics for comparing performance along these categories. We use the new metrics to compare various backbones for a simplified version of the APD, and show a more fine-grained and robust way to compare models with each other, especially in terms of safety-critical performance. We achieve state-of-the-art (SOTA) results on CityPersons-reasonable (without extra training data) by using a rather simple architecture.
Problem

Research questions and friction points this paper is trying to address.

Current pedestrian detection metrics do not allow a realistic performance evaluation
Existing benchmarks fail to distinguish effectively between different types of errors
Automated driving needs a more fine-grained comparison of safety-critical performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using image segmentation to categorize detection errors
Proposing eight error categories and new evaluation metrics
Achieving state-of-the-art results with a simplified architecture
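
The first idea in the list above, using a segmentation map to sort detection errors, can be illustrated with a minimal Python sketch. It is not the paper's method: the class id `PERSON_ID`, the threshold of 0.3, and the two labels returned by `label_false_positive` are all assumptions made for illustration, and they do not correspond to the eight error categories the paper defines. The sketch only shows how per-pixel labels inside an unmatched detection box could separate a detection that sits on person pixels from one that sits on background.

```python
import numpy as np

PERSON_ID = 1  # hypothetical class id for "person" in the segmentation map


def person_fraction(seg_map, box, person_id=PERSON_ID):
    """Return the fraction of pixels inside box (x1, y1, x2, y2) labelled as person."""
    x1, y1, x2, y2 = box
    patch = seg_map[y1:y2, x1:x2]
    if patch.size == 0:
        # Degenerate box: no pixels to inspect.
        return 0.0
    return float((patch == person_id).mean())


def label_false_positive(seg_map, box, threshold=0.3):
    """Assign an illustrative label to a detection that matched no ground truth.

    The labels and the threshold are assumptions for this sketch only.
    """
    if person_fraction(seg_map, box) >= threshold:
        return "overlaps_person_pixels"
    return "background"


# Toy 10x10 segmentation map with one person region (rows 2-7, columns 2-5).
seg = np.zeros((10, 10), dtype=np.int64)
seg[2:8, 2:6] = PERSON_ID

on_person = label_false_positive(seg, (2, 2, 6, 8))
on_background = label_false_positive(seg, (6, 0, 10, 4))
```

A real evaluation would first match detections to ground-truth boxes and only then inspect the unmatched ones; the matching step is omitted here to keep the sketch short.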
Patrick Feifel
Stellantis, Opel Automobile GmbH; Carl von Ossietzky Universität Oldenburg
Benedikt Franke
Deutsches Zentrum für Luft- und Raumfahrt; Universität Ulm
Frank Bonarens
Stellantis, Opel Automobile GmbH
Frank Koster
Deutsches Zentrum für Luft- und Raumfahrt; Carl von Ossietzky Universität Oldenburg
Arne Raulf
Deutsches Zentrum für Luft- und Raumfahrt
Friedhelm Schwenker
Ulm University, Institute of Neural Information Processing
neural networks, machine learning, pattern recognition, data mining, affective computing