🤖 AI Summary
This work addresses dim infrared small target detection, where feature downsampling often discards subtle target details and degrades localization accuracy. To overcome this limitation, the authors propose a native-resolution feature extraction and fusion framework that uses full-resolution features rather than the downsampled features that existing methods rely on. The method combines a confidence-based token selection mechanism with a multi-scale low-level detail enhancement strategy, reducing computational overhead relative to dense native-resolution processing while improving detection sensitivity and localization precision for faint small targets. Experiments on four public datasets report state-of-the-art performance, supporting the robustness and effectiveness of the proposed framework.
📝 Abstract
Infrared small target detection (IRSTD) faces the inherent challenge of precisely localizing dim targets amid complex background clutter. Despite recent progress, existing methods typically downsample features and thereby discard small-target details, resulting in suboptimal performance. In this paper, we present Na-IRSTD, a native-resolution feature extraction and fusion framework for IRSTD. The framework incorporates native-resolution features to preserve subtle target cues, overcoming the resolution limitations of existing infrared approaches and significantly improving the model's ability to localize small targets. We also introduce a token reduction and selection strategy that selects target patches with high accuracy and confidence. This strategy enhances the low-level details of the features while processing far fewer native-resolution patch tokens than dense processing would, avoiding a prohibitive computational burden. Extensive experiments demonstrate the robustness and effectiveness of the token reduction and selection strategy across multiple public datasets. Overall, Na-IRSTD achieves state-of-the-art performance on four benchmarks.
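The abstract does not specify how tokens are selected, so the following is only a minimal sketch of one plausible form of confidence-based token selection, not the Na-IRSTD implementation. It assumes a per-patch confidence map is already available (for example, from a coarse prediction head) and keeps only patches that pass a threshold, capped at a fixed budget. The function name `select_tokens`, the parameters `threshold` and `max_tokens`, and the example values are all hypothetical.

```python
from typing import List, Tuple


def select_tokens(
    confidences: List[List[float]],
    threshold: float = 0.5,
    max_tokens: int = 4,
) -> List[Tuple[int, int]]:
    """Pick (row, col) patch indices to process at native resolution.

    Assumption (not from the paper): a patch is kept when its confidence
    is at least `threshold`, and at most `max_tokens` patches are kept,
    highest confidence first.
    """
    candidates = [
        (score, r, c)
        for r, row in enumerate(confidences)
        for c, score in enumerate(row)
        if score >= threshold
    ]
    # Sort by descending confidence; break ties by position for determinism.
    candidates.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(r, c) for _, r, c in candidates[:max_tokens]]


# Hypothetical 4x4 grid of patch confidences (illustrative values only).
conf = [
    [0.05, 0.10, 0.02, 0.01],
    [0.03, 0.92, 0.60, 0.04],
    [0.02, 0.55, 0.08, 0.03],
    [0.01, 0.02, 0.03, 0.70],
]

selected = select_tokens(conf, threshold=0.5, max_tokens=3)
total = sum(len(row) for row in conf)
print(f"kept {len(selected)} of {total} patches: {selected}")
```

In this toy grid only 3 of the 16 patches are passed on for native-resolution processing, which illustrates how a selection step of this kind can keep the token count well below that of dense processing.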