Visible-Thermal Tiny Object Detection: A Benchmark Dataset and Baselines

📅 2024-06-20
🏛️ arXiv.org
📈 Citations: 4
Influential: 1
🤖 AI Summary
Problem: Existing RGB-T small object detection (SOD) benchmarks suffer from limited scale, low category diversity, misaligned modalities, and too few micro-scale instances (<16×16 pixels), which hinders fair, multi-class evaluation of tiny targets. Method: We introduce RGBT-Tiny, the first large-scale, high-diversity RGB-T SOD benchmark, comprising 115 paired sequences, 93K frames, and 1.2M manually annotated bounding boxes with tracking IDs. Over 81% of targets are smaller than 16×16 pixels, and annotations are paired across the two modalities. We further propose Scale-Adaptive Fitness (SAFit), an evaluation measure designed to stay robust on both small and large targets. Contribution/Results: We benchmark 23 recent state-of-the-art methods on RGBT-Tiny, covering visible generic detection, visible SOD, thermal SOD, and RGBT object detection, to establish baselines for RGB-T tiny object detection.

📝 Abstract
Small object detection (SOD) has been a longstanding yet challenging task for decades, with numerous datasets and algorithms being developed. However, they mainly focus on either the visible or the thermal modality, while visible-thermal (RGBT) bimodality is rarely explored. Although some RGBT datasets have been developed recently, their insufficient quantity, limited categories, misaligned images and large target sizes prevent them from providing an impartial benchmark for evaluating multi-category visible-thermal small object detection (RGBT SOD) algorithms. In this paper, we build the first large-scale benchmark with high diversity for RGBT SOD (namely RGBT-Tiny), including 115 paired sequences, 93K frames and 1.2M manual annotations. RGBT-Tiny contains abundant targets (7 categories) and high-diversity scenes (8 types that cover different illumination and density variations). Notably, over 81% of targets are smaller than 16×16 pixels, and we provide paired bounding box annotations with tracking IDs to offer an extremely challenging benchmark with wide-ranging applications, such as RGBT fusion, detection and tracking. In addition, we propose a scale adaptive fitness (SAFit) measure that exhibits high robustness on both small and large targets. The proposed SAFit can provide reasonable performance evaluation and promote detection performance. Based on the proposed RGBT-Tiny dataset and SAFit measure, extensive evaluations have been conducted, including 23 recent state-of-the-art algorithms that cover four different types (i.e., visible generic detection, visible SOD, thermal SOD and RGBT object detection). The project is available at https://github.com/XinyiYing/RGBT-Tiny.
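To make the "smaller than 16×16" statistic concrete, the following is a minimal sketch of how such a share could be computed from box sizes. The `(width, height)` input format, the function name `tiny_fraction`, and the sample boxes are hypothetical and are not taken from the RGBT-Tiny annotation files. The sketch also assumes that "smaller than 16×16" means a box area below 256 square pixels, which the abstract does not spell out.

```python
def tiny_fraction(boxes, side=16):
    """Return the share of boxes whose area is below side * side pixels.

    boxes: iterable of (width, height) pairs in pixels (hypothetical format).
    The area-based reading of "smaller than 16x16" is an assumption.
    """
    boxes = list(boxes)
    if not boxes:
        return 0.0
    limit = side * side
    tiny = sum(1 for w, h in boxes if w * h < limit)
    return tiny / len(boxes)


# Made-up box sizes for illustration only.
sample_boxes = [(8, 8), (12, 10), (15, 15), (20, 20), (40, 30)]
print(tiny_fraction(sample_boxes))
```

With real annotations, the same function would be applied to the widths and heights parsed from the dataset's label files.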
Problem

Research questions and friction points this paper is trying to address.

Lack of diverse visible-thermal small object datasets.
Need for robust evaluation metrics for small objects.
Absence of a challenging, impartial benchmark for multi-category RGBT SOD.
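The need for a scale-robust measure can be illustrated with plain IoU: the same localization error costs a tiny box far more overlap than a large one. The sketch below is not the SAFit measure, whose definition is not given here. It only uses standard IoU on axis-aligned boxes, with a hypothetical helper `shifted_iou` that offsets a square prediction by a fixed number of pixels in both directions.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted_iou(size, shift):
    """IoU between a size x size box and a copy shifted diagonally by `shift` pixels."""
    gt = (0, 0, size, size)
    pred = (shift, shift, size + shift, size + shift)
    return iou(gt, pred)


# Same 2-pixel error, three target sizes.
for size in (8, 16, 64):
    print(size, round(shifted_iou(size, 2), 3))
```

For a 2-pixel diagonal offset, the 8×8 box falls below the common 0.5 IoU threshold, while the 64×64 box stays above 0.88, so a fixed IoU threshold penalizes tiny targets disproportionately.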
Innovation

Methods, ideas, or system contributions that make the work stand out.

Developed RGBT-Tiny, the first large-scale RGBT SOD dataset.
Introduced the scale adaptive fitness (SAFit) measure.
Evaluated 23 state-of-the-art algorithms across four detector types.
Authors

Xinyi Ying
National University of Defense Technology
infrared small object detection
Chao Xiao
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Ruojing Li
National University of Defense Technology
Small Object Detection, video understanding, deep learning, machine learning
Xu He
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Boyang Li
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Zhaoxu Li
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Yingqian Wang
National University of Defense Technology
light field, image super-resolution
Mingyuan Hu
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Qingyu Xu
Beijing Huairou Laboratory
Power System Planning, Climate Policy, Optimization, Stochastic Programming, Energy Policy
Zaiping Lin
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Miao Li
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Shilin Zhou
School of Computer Science and Technology, Soochow University
Machine Learning, Natural Language Processing
Wei An
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Weidong Sheng
College of Electronic Science and Technology, NUDT, Changsha 410073, China
Li Liu
College of Electronic Science and Technology, NUDT, Changsha 410073, China