How good are deep learning methods for automated road safety analysis using video data? An experimental study

📅 2025-03-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This study systematically evaluates how well deep learning-based multi-object tracking (MOT) methods support video-based road safety analysis. Method: Using the annotated KITTI traffic video dataset, we combine 2D/3D detection (monocular and stereo), trajectory extraction, and surrogate safety indicators, including time-to-collision (TTC) and post-encroachment time (PET), and propose two post-processing steps, IDsplit and SS, to make tracking more robust to identity switches and trajectory fragmentation. Contribution/Results: All three tested MOT methods show a systematic bias: they overestimate the number of interactions and underestimate TTC, which makes safety assessments overly pessimistic. Error attribution analysis traces this bias primarily to ID switches and trajectory truncation. The findings provide a benchmark and a methodological basis for roadside perception data fusion and MOT bias correction in safety-critical applications.
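The link between identity switches and inflated interaction counts can be illustrated with a toy example. The sketch below is not the paper's procedure: it assumes, purely for illustration, that an "interaction" is a unique pair of track IDs whose TTC drops below a threshold in at least one frame. The function name `count_interactions`, the record layout, and the 5-second threshold are all hypothetical.

```python
def count_interactions(records, ttc_threshold=5.0):
    """Count unique ID pairs whose TTC falls below the threshold.

    records: iterable of (frame, id_a, id_b, ttc) tuples (hypothetical layout).
    Assumption: one interaction = one unique pair of track IDs.
    """
    pairs = set()
    for _frame, id_a, id_b, ttc in records:
        if ttc is not None and ttc < ttc_threshold:
            pairs.add(frozenset((id_a, id_b)))
    return len(pairs)


# Ground truth: road users 1 and 2 approach each other over 10 frames.
ground_truth = [(f, 1, 2, 4.0 - 0.2 * f) for f in range(10)]

# Tracker output: same motion, but user 1 is re-labelled as ID 7 at frame 5
# (an identity switch), so the single trajectory is split in two.
tracked = [(f, 1 if f < 5 else 7, 2, 4.0 - 0.2 * f) for f in range(10)]

gt_count = count_interactions(ground_truth)
tracked_count = count_interactions(tracked)
print(gt_count, tracked_count)
```

Under this definition, the single true interaction is counted twice after the identity switch, because the pairs (1, 2) and (7, 2) are treated as distinct.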

📝 Abstract

Image-based multi-object detection (MOD) and multi-object tracking (MOT) are advancing at a fast pace. A variety of 2D and 3D MOD and MOT methods have been developed for monocular and stereo cameras. Road safety analysis can benefit from those advancements. As crashes are rare events, surrogate measures of safety (SMoS) have been developed for safety analyses. (Semi-)Automated safety analysis methods extract road user trajectories to compute safety indicators, for example, Time-to-Collision (TTC) and Post-encroachment Time (PET). Inspired by the success of deep learning in MOD and MOT, we investigate three MOT methods, including one based on a stereo-camera, using the annotated KITTI traffic video dataset. Two post-processing steps, IDsplit and SS, are developed to improve the tracking results and investigate the factors influencing the TTC. The experimental results show that, despite some advantages in terms of the numbers of interactions or similarity to the TTC distributions, all the tested methods systematically over-estimate the number of interactions and under-estimate the TTC: they report more interactions and more severe interactions, making the road user interactions appear less safe than they are. Further efforts will be directed towards testing more methods and more data, in particular from roadside sensors, to verify the results and improve the performance.
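To make the two indicators named above concrete, here is a minimal sketch of how TTC and PET can be computed from trajectory data. It is not the paper's implementation: it assumes constant velocities over the prediction horizon and models each pair of road users as discs with a combined radius, which is a common simplification. The function names, the `radius` parameter, and the input layout are hypothetical.

```python
import math


def time_to_collision(p1, v1, p2, v2, radius=1.0):
    """TTC in seconds under a constant-velocity assumption.

    p1, p2: (x, y) positions in metres; v1, v2: (vx, vy) velocities in m/s.
    radius: sum of the two users' radii (disc model, an assumption).
    Returns None when the users are not on a collision course.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dvx, dvy = v2[0] - v1[0], v2[1] - v1[1]
    # Solve |dp + dv * t| = radius for the smallest non-negative t.
    a = dvx * dvx + dvy * dvy
    b = 2.0 * (dx * dvx + dy * dvy)
    c = dx * dx + dy * dy - radius * radius
    if c <= 0.0:
        return 0.0  # discs already overlap
    if a == 0.0:
        return None  # no relative motion
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # paths never come within `radius`
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0.0 else None


def post_encroachment_time(t_first_leaves, t_second_enters):
    """PET: time between the first user leaving the conflict area
    and the second user entering it (both in seconds)."""
    return t_second_enters - t_first_leaves


# A vehicle at 10 m/s approaching a stationary one 50 m ahead,
# with a combined radius of 2 m: the 48 m gap closes in 4.8 s.
ttc = time_to_collision((0, 0), (10, 0), (50, 0), (0, 0), radius=2.0)
pet = post_encroachment_time(3.0, 4.5)
```

Lower values of either indicator denote more severe interactions, which is why an underestimated TTC makes interactions appear less safe than they are.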

Problem

Research questions and friction points this paper is trying to address.

Evaluate deep learning for automated road safety analysis using video data.

Assess multi-object detection and tracking methods for safety indicators.

Improve accuracy of safety measures like Time-to-Collision using advanced tracking.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes deep learning for multi-object tracking.

Evaluates three MOT methods, including one based on a stereo camera.

Develops two post-processing steps, IDsplit and SS, to improve tracking results and investigate the factors influencing TTC.

🔎 Similar Papers

Urban Safety Perception Assessments via Integrating Multimodal Large Language Models with Street View Images