Rethinking Generalizable Infrared Small Target Detection: A Real-scene Benchmark and Cross-view Representation Learning

📅 2025-04-23
🤖 AI Summary
To address the poor generalization of infrared small target detection (ISTD) across sensors, environments, and noise conditions, this paper proposes a domain-adaptation framework. It introduces cross-view channel alignment (CCA), presented as the first of its kind, together with a Top-K feature fusion mechanism for robust cross-view feature alignment; designs a noise-guided representation learning strategy that makes the model more robust to imaging noise; and constructs RealScene-ISTD, presented as the first benchmark dataset dedicated to evaluating generalization in realistic scenarios. Experiments show that the method outperforms state-of-the-art approaches in detection probability (Pd), false alarm rate (Fa), and intersection over union (IoU), with the most notable gains in cross-domain generalization and robustness to noise interference.

📝 Abstract
Infrared small target detection (ISTD) is highly sensitive to sensor type, observation conditions, and the intrinsic properties of the target. These factors can introduce substantial variations in the distribution of acquired infrared image data, a phenomenon known as domain shift. Such distribution discrepancies significantly hinder the generalization capability of ISTD models across diverse scenarios. To tackle this challenge, this paper introduces an ISTD framework enhanced by domain adaptation. To alleviate distribution shift between datasets and achieve cross-sample alignment, we introduce Cross-view Channel Alignment (CCA). Additionally, we propose the Cross-view Top-K Fusion strategy, which integrates target information with diverse background features, enhancing the model's ability to extract critical data characteristics. To further mitigate the impact of noise on ISTD, we develop a Noise-guided Representation learning strategy. This approach enables the model to learn more noise-resistant feature representations, improving its generalization capability across diverse noisy domains. Finally, we develop a dedicated infrared small target dataset, RealScene-ISTD. Compared to state-of-the-art methods, our approach demonstrates superior performance in terms of detection probability (Pd), false alarm rate (Fa), and intersection over union (IoU). The code is available at: https://github.com/luy0222/RealScene-ISTD.
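
The abstract reports results in terms of Pd, Fa, and IoU but does not define them here. As a rough illustration, the sketch below computes simplified versions of the three metrics on binary masks. The function names `connected_components` and `istd_metrics` are hypothetical, and the matching rules are assumptions: a ground-truth target counts as detected if any predicted pixel overlaps it, and Fa is taken as the fraction of image pixels predicted as target but absent from the ground truth. The paper's own evaluation protocol may differ (for example, by matching on centroid distance).

```python
from collections import deque


def connected_components(mask):
    """Return a list of pixel sets, one per 8-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), set()
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                pixels.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        inside = 0 <= ny < h and 0 <= nx < w
                        if inside and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def istd_metrics(pred, gt):
    """Return (IoU, Pd, Fa) for two binary masks of equal size.

    Simplifying assumptions (not taken from the paper):
    - IoU is pixel-level intersection over union.
    - Pd is the fraction of ground-truth targets touched by a predicted pixel.
    - Fa is the fraction of all pixels predicted positive but not in gt.
    """
    h, w = len(gt), len(gt[0])
    pred_px = {(r, c) for r in range(h) for c in range(w) if pred[r][c]}
    gt_px = {(r, c) for r in range(h) for c in range(w) if gt[r][c]}
    union = pred_px | gt_px
    iou = len(pred_px & gt_px) / len(union) if union else 1.0
    targets = connected_components(gt)
    detected = sum(1 for target in targets if target & pred_px)
    pd = detected / len(targets) if targets else 1.0
    fa = len(pred_px - gt_px) / (h * w)
    return iou, pd, fa


# Toy 4x4 example: two ground-truth targets, one hit, one false-alarm pixel.
gt_mask = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 1]]
pred_mask = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
print(istd_metrics(pred_mask, gt_mask))  # (0.25, 0.5, 0.0625)
```

In the toy example, one of two targets is touched (Pd = 0.5), one of sixteen pixels is a false alarm (Fa = 0.0625), and the masks share one of four occupied pixels (IoU = 0.25). A higher Pd and IoU and a lower Fa indicate better detection.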

Problem

Research questions and friction points this paper is trying to address.

Address domain shift in infrared small target detection
Improve cross-scenario generalization of detection models
Reduce noise impact on target detection accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

Cross-view Channel Alignment for domain adaptation
Cross-view Top-K Fusion for feature integration
Noise-guided Representation learning for robustness
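
This summary does not say how the Noise-guided Representation learning strategy is implemented. Purely as a generic illustration of exposing a detector to imaging noise, the sketch below perturbs a normalized image with additive Gaussian noise to produce a clean/noisy training pair. The function name `add_gaussian_noise`, the Gaussian noise model, and the `sigma` value are assumptions for illustration and are not taken from the paper.

```python
import random


def add_gaussian_noise(image, sigma, seed=None):
    """Return a copy of `image` with zero-mean Gaussian noise added.

    `image` is a list of rows with pixel values in [0, 1]; results are
    clipped back to that range. This is a generic augmentation, not the
    paper's noise-guided strategy.
    """
    rng = random.Random(seed)  # local generator keeps the call reproducible
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


# A dim 3x3 patch with one bright target pixel in the centre.
clean = [[0.1, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.1]]
noisy = add_gaussian_noise(clean, sigma=0.05, seed=0)
pair = (clean, noisy)  # e.g. fed to a model that should respond alike to both
```

One common way to use such pairs is to penalize differences between the features a model extracts from the clean and the noisy input; whether the paper does this is not stated in the summary.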
Yahao Lu
Guangdong University of Technology
Infrared small target detection, 3D target detection, Transformer, Diffusion
Yuehui Li
School of Information Engineering, Guangdong University of Technology, Guangzhou, 510006, China
Xingyuan Guo
Southern Power Grid, Ltd., Guangzhou, 510000, China
Shuai Yuan
Xi'an Key Laboratory of Infrared Technology and System, School of Optoelectronic Engineering, Xidian University, Xi'an 710071, China
Yukai Shi
School of Information Engineering, Guangdong University of Technology, Guangzhou, 510006, China
Liang Lin
Fellow of IEEE/IAPR, Professor of Computer Science, Sun Yat-sen University
Embodied AI, Causal Inference and Learning, Multimodal Data Analysis