Rethinking Evaluation of Infrared Small Target Detection

📅 2025-09-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing evaluation of infrared small target detection (IRSTD) suffers from three key limitations: (1) fragmented metrics, (2) neglect of failure mode analysis, and (3) overreliance on dataset-specific benchmarks—undermining robustness and generalization assessment. To address these, we propose a hierarchical hybrid evaluation framework comprising: (1) unified quantification by integrating pixel-level and object-level metrics; (2) an interpretable error decomposition mechanism that systematically identifies failure modes—including missed detections, false alarms, and localization errors; and (3) a cross-dataset transfer evaluation paradigm to rigorously validate generalization capability. We further release an open-source, standardized evaluation toolkit. Our framework significantly enhances evaluation transparency, granularity, and practical guidance, establishing a reproducible, comparable, and principled benchmark for developing robust and efficient IRSTD models.
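The hybrid pixel- and target-level scoring described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the matching radius `dist_thresh`, and the mixing weight `alpha` are all hypothetical.

```python
import numpy as np

def pixel_iou(pred, gt):
    """Pixel-level IoU between two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union > 0 else 1.0

def target_pd_fa(pred_centroids, gt_centroids, dist_thresh=3.0):
    """Target-level probability of detection (Pd) and false-alarm count.
    A ground-truth target counts as detected if an unmatched prediction
    lies within dist_thresh pixels of its centroid; predictions matched
    to no target are false alarms. The threshold is illustrative."""
    matched = set()
    detected = 0
    for g in gt_centroids:
        for i, p in enumerate(pred_centroids):
            if i not in matched and np.hypot(g[0] - p[0], g[1] - p[1]) <= dist_thresh:
                matched.add(i)
                detected += 1
                break
    pd = detected / len(gt_centroids) if gt_centroids else 1.0
    fa = len(pred_centroids) - len(matched)
    return pd, fa

def hybrid_score(pred, gt, pred_centroids, gt_centroids, alpha=0.5):
    """Blend pixel IoU with target-level Pd; alpha is a hypothetical
    mixing weight, not a value from the paper."""
    pd, _ = target_pd_fa(pred_centroids, gt_centroids)
    return alpha * pixel_iou(pred, gt) + (1 - alpha) * pd
```

In this toy form, a model that segments detected targets precisely but misses half of them scores poorly on the target term even with high pixel IoU, which is the kind of imbalance a fragmented single-level metric can hide.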

📝 Abstract
As an essential vision task, infrared small target detection (IRSTD) has seen significant advancements through deep learning. However, critical limitations in current evaluation protocols impede further progress. First, existing methods rely on fragmented pixel-level and target-level metrics, which fail to provide a comprehensive view of model capabilities. Second, an excessive emphasis on overall performance scores obscures crucial error analysis, which is vital for identifying failure modes and improving real-world system performance. Third, the field predominantly adopts dataset-specific training-testing paradigms, hindering the understanding of model robustness and generalization across diverse infrared scenarios. This paper addresses these issues by introducing a hybrid-level metric incorporating pixel- and target-level performance, proposing a systematic error analysis method, and emphasizing the importance of cross-dataset evaluation. Together, these contributions offer a more thorough and principled hierarchical analysis framework, ultimately fostering the development of more effective and robust IRSTD models. An open-source toolkit has been released to facilitate standardized benchmarking.
Problem

Research questions and friction points this paper is trying to address.

Current evaluation protocols for infrared small target detection have critical limitations
Existing metrics fail to provide comprehensive analysis of model capabilities and errors
Dataset-specific training paradigms hinder understanding of model robustness and generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid-level metric combining pixel and target performance
Systematic error analysis method for failure mode identification
Cross-dataset evaluation framework for robustness assessment
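The systematic error analysis above can be sketched as a per-target classification into correct detections, localization errors, missed detections, and false alarms. This is a toy sketch under assumed definitions: the `tight` and `loose` matching radii and the greedy nearest-match rule are illustrative choices, not the paper's method.

```python
import numpy as np

def decompose_errors(pred_centroids, gt_centroids, tight=1.5, loose=5.0):
    """Classify each ground-truth target as 'correct' (nearest unmatched
    prediction within `tight` pixels), 'localization' (between `tight`
    and `loose`), or 'miss' (no prediction within `loose`); predictions
    matched to no target count as 'false_alarm'. Thresholds are
    hypothetical, for illustration only."""
    used = set()
    counts = {"correct": 0, "localization": 0, "miss": 0, "false_alarm": 0}
    for g in gt_centroids:
        best_i, best_d = None, float("inf")
        for i, p in enumerate(pred_centroids):
            if i in used:
                continue
            d = np.hypot(g[0] - p[0], g[1] - p[1])
            if d < best_d:
                best_i, best_d = i, d
        if best_i is not None and best_d <= tight:
            counts["correct"] += 1
            used.add(best_i)
        elif best_i is not None and best_d <= loose:
            counts["localization"] += 1
            used.add(best_i)
        else:
            counts["miss"] += 1
    counts["false_alarm"] = len(pred_centroids) - len(used)
    return counts
```

Reporting these four counts separately, rather than a single aggregate score, is what makes the failure modes of a detector visible.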