🤖 AI Summary
This work addresses pedestrian detection in low-light/nighttime scenarios via visible-light and far-infrared (FIR) image fusion. Methodologically, it presents the first systematic taxonomy of multimodal detection approaches—including R-CNN–based, YOLO-family, anchor-free, and graph neural network architectures—while explicitly incorporating thermal imaging priors such as temperature contrast and structural sparsity. The study identifies four fundamental challenges: low signal-to-noise ratio, cross-modal misalignment, scarce annotated data, and domain shift; and surveys five major benchmark datasets (e.g., KAIST, CVC-14). Its key contributions include a novel analysis of technical transfer bottlenecks across modalities and a critical assessment of evaluation gaps. The proposed taxonomy provides a reproducible, structured framework for algorithm design and clarifies the evolutionary trajectory of multimodal pedestrian detection research.
📝 Abstract
Pedestrian detection has become a cornerstone for several high-level tasks, including autonomous driving, intelligent transportation, and traffic surveillance. Numerous works have focused on pedestrian detection using visible images, mainly captured in the daytime. However, the task becomes considerably more challenging under poor lighting or at nighttime. Recently, alternative sensing sources, such as Far InfraRed (FIR) thermal sensor feeds, have been explored for detecting pedestrians in low-light conditions. This study reviews recent developments in low-light pedestrian detection approaches. It systematically categorizes and analyses algorithms ranging from region-based to non-region-based and graph-based learning methodologies, highlighting their designs, implementation issues, and open challenges. It also outlines the key benchmark datasets that can support the research and development of advanced pedestrian detection algorithms, particularly in low-light situations.