Multi-Modal Camera-Based Detection of Vulnerable Road Users

📅 2025-09-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address degraded detection of vulnerable road users (VRUs) under low light, adverse weather, and class-imbalanced training data, this paper proposes an RGB-thermal multimodal detection framework. The authors fine-tune YOLOv8 with three key measures: (1) RGB-to-thermal cross-modal data augmentation to mitigate bias against scarce VRU categories; (2) a class-weighted loss function to improve recall for underrepresented classes (e.g., cyclists and motorcyclists); and (3) partial backbone freezing to balance accuracy and inference efficiency. Experiments on the KITTI, BDD100K, and Teledyne FLIR benchmarks indicate that the thermal modality contributes the largest accuracy gain. The augmentation strategy boosts average VRU recall by 8.3%, while a 640×640 input resolution combined with light augmentations achieves the best accuracy-efficiency trade-off.
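The summary does not say how the RGB-to-thermal augmentation is implemented. As a rough illustration only, the sketch below uses a common stand-in: converting RGB pixels to single-channel luma (ITU-R BT.601 weights) and stretching the contrast, which yields a thermal-looking grayscale image. The function names and the nested-list image format are hypothetical, and this should not be read as the authors' method.

```python
def rgb_to_pseudo_thermal(image):
    """Map an RGB image to single-channel intensities.

    `image` is a list of rows, each row a list of (r, g, b) tuples in 0-255.
    Assumption: BT.601 luma stands in for thermal intensity. Real thermal
    imagery measures emitted heat, so this is only a crude proxy.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in image
    ]


def stretch_contrast(gray):
    """Linearly rescale intensities so they span the full 0-255 range."""
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant image has no contrast to stretch.
        return [[0 for _ in row] for row in gray]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in gray]
```

Bounding-box labels carry over unchanged, because the conversion does not move any pixels.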

📝 Abstract
Vulnerable road users (VRUs) such as pedestrians, cyclists, and motorcyclists account for more than half of global traffic deaths, yet their detection remains challenging in poor lighting, adverse weather, and imbalanced datasets. This paper presents a multimodal detection framework that integrates RGB and thermal infrared imaging with a fine-tuned YOLOv8 model. Training leveraged the KITTI, BDD100K, and Teledyne FLIR datasets, with class re-weighting and light augmentations to improve minority-class performance and robustness. Experiments show that 640-pixel resolution and partial backbone freezing optimise accuracy and efficiency, while class-weighted losses enhance recall for rare VRUs. Results highlight that thermal models achieve the highest precision and that RGB-to-thermal augmentation boosts recall, demonstrating the potential of multimodal detection to improve VRU safety at intersections.
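Partial backbone freezing means keeping the earliest layers fixed during fine-tuning and updating only the remaining ones. The sketch below shows the selection step in plain Python, assuming parameters are named `model.<layer_index>.<rest>`. The naming scheme, the function name, and the layer count in the test are illustrative assumptions, since the abstract does not state how many layers were frozen.

```python
def frozen_parameter_names(param_names, num_frozen_layers):
    """Return the names of parameters in the first `num_frozen_layers` layers.

    Assumption: names follow a hypothetical "model.<index>.<rest>" pattern.
    The trailing dot in each prefix keeps "model.1." from matching "model.10.".
    """
    prefixes = tuple(f"model.{i}." for i in range(num_frozen_layers))
    return [name for name in param_names if name.startswith(prefixes)]
```

In a training loop, the selected parameters would have gradient updates disabled. The Ultralytics training API exposes a `freeze` argument for this purpose; the source does not say whether the authors used it.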
Problem

Research questions and friction points this paper is trying to address.

Detecting vulnerable road users in poor lighting conditions
Addressing data imbalance issues for rare VRU classes
Improving detection robustness during adverse weather scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates RGB and thermal imaging with fine-tuned YOLOv8
Uses class re-weighting and light augmentations for robustness
Balances accuracy and efficiency with 640-pixel input resolution and partial backbone freezing
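One common way to realise class re-weighting is inverse-frequency weighting, where class c receives weight w_c = N / (K * n_c) for N total instances, K classes, and n_c instances of class c. The sketch below is a minimal version of that scheme. The paper's exact weighting formula is not given here, and the function name and the counts in the usage note are hypothetical.

```python
def inverse_frequency_weights(class_counts):
    """Compute per-class loss weights w_c = N / (K * n_c).

    `class_counts` maps class name to number of labelled instances.
    Rare classes get weights above 1, frequent classes below 1.
    """
    if any(n <= 0 for n in class_counts.values()):
        raise ValueError("every class needs at least one instance")
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {c: total / (num_classes * n) for c, n in class_counts.items()}
```

With made-up counts of 600 pedestrians, 300 cyclists, and 100 motorcyclists, the motorcyclist class receives six times the weight of the pedestrian class, so errors on the rare class contribute more to the loss.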