🤖 AI Summary
Infrared small target detection suffers from inaccurate edge localization and from targets being submerged by background clutter, because the targets are extremely small and lack textural cues. To address this, the authors propose the Gradient-Guided Learning Network (GGL-Net), which they describe as the first deep learning-based infrared small target detection method to explicitly incorporate gradient magnitude images. GGL-Net introduces a Gradient Supplementary Module (GSM), which encodes raw gradient information into deeper network layers, and a Two-way Guidance Fusion Module (TGFM), which enables bidirectional, multi-scale collaboration between semantic and detail features. The architecture integrates dual-branch feature extraction, channel-spatial joint attention, and hierarchical feature interaction. Extensive experiments show state-of-the-art performance on both the real-world NUAA-SIRST and synthetic NUDT-SIRST benchmarks. The source code is publicly available and has been integrated into the MSDA-Net project.
📝 Abstract
Recently, infrared small target detection has attracted extensive attention. However, because infrared small targets are small and lack intrinsic features, existing methods generally suffer from inaccurate edge positioning, and targets are easily submerged by the background. Therefore, we propose an innovative gradient-guided learning network (GGL-Net). Specifically, we are the first to explore introducing gradient magnitude images into deep learning-based infrared small target detection, which helps emphasize edge details and alleviates the inaccurate edge positioning of small targets. On this basis, we propose a novel dual-branch feature extraction network that uses the proposed gradient supplementary module (GSM) to encode raw gradient information into deeper network layers and embeds attention mechanisms to enhance feature extraction ability. In addition, we construct a two-way guidance fusion module (TGFM), which fully considers the characteristics of feature maps at different levels. It facilitates the effective fusion of multi-scale feature maps and extracts richer semantic and detailed information through two-way guidance. Extensive experiments show that GGL-Net achieves state-of-the-art results on the public real NUAA-SIRST dataset and the public synthetic NUDT-SIRST dataset. Our code has been integrated into https://github.com/YuChuang1205/MSDA-Net
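To make the notion of a gradient magnitude image concrete, the following is a minimal sketch in Python with NumPy. It assumes a Sobel operator and max-normalization to [0, 1]; the abstract does not state which gradient operator or scaling GGL-Net uses, so the function name `gradient_magnitude`, the kernel choice, and the toy 9x9 frame are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def gradient_magnitude(image: np.ndarray) -> np.ndarray:
    """Sobel gradient magnitude of a 2-D grayscale image, scaled to [0, 1].

    Assumption: Sobel kernels and max-normalization are illustrative choices,
    not taken from the GGL-Net paper.
    """
    img = image.astype(np.float64)
    # Replicate border pixels so the output keeps the input's shape.
    p = np.pad(img, 1, mode="edge")
    # Horizontal derivative: right neighbours minus left neighbours (1-2-1 weights).
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    # Vertical derivative: lower neighbours minus upper neighbours (1-2-1 weights).
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    mag = np.hypot(gx, gy)
    peak = mag.max()
    # Avoid division by zero on a perfectly flat image.
    return mag / peak if peak > 0 else mag


# Toy frame (hypothetical data): dark background with a 3x3 bright "target".
frame = np.zeros((9, 9))
frame[3:6, 3:6] = 1.0
mag = gradient_magnitude(frame)
```

On this toy frame the response is zero on the flat background and at the target's center pixel, and nonzero along the target's boundary, which is the edge emphasis the abstract refers to. In a dual-branch design, a map like `mag` would be fed alongside the raw image; how GGL-Net does so is defined by the GSM in the paper, not by this sketch.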