AI Summary
This work addresses infrared small target detection, where conventional encoder-decoder architectures trained with pixel-level supervision are limited by the targets' extremely small spatial extent and ambiguous boundaries. The paper reformulates the task as centroid regression and introduces SPIRE (Single-Point Supervision guided Infrared Probabilistic Response Encoding), an end-to-end, encoder-only framework with no decoder. Its two key components are Point-Response Prior Supervision (PRPS), which transforms single-point annotations into probabilistic response maps, and a High-Resolution Probabilistic Encoder (HRPE). Evaluated on benchmarks including SIRST-UAVB and SIRST4, SPIRE achieves competitive target-level detection performance with a consistently low false alarm rate and significantly reduced computational cost.
Abstract
Infrared small target detection (IRSTD) aims to separate small targets from cluttered backgrounds. Extensive research has been dedicated to the "encoder-decoder" segmentation paradigm guided by pixel-level supervision. Although these methods achieve promising performance, they neglect the fact that small targets occupy only a few pixels and usually have blurred boundaries caused by cluttered backgrounds. Based on this observation, we argue that the first principle of IRSTD should be target localization rather than segmentation of the entire target region together with indistinguishable background noise. In this paper, we reformulate IRSTD as a centroid regression task and propose a novel Single-Point Supervision guided Infrared Probabilistic Response Encoding method (namely, SPIRE). This is challenging because of the mismatch between the reduced supervision and the equivalent network output. Specifically, we first design Point-Response Prior Supervision (PRPS), which transforms single-point annotations into probabilistic response maps consistent with infrared point-target response characteristics, and pair it with a High-Resolution Probabilistic Encoder (HRPE) that enables encoder-only, end-to-end regression without decoder reconstruction. By preserving high-resolution features and increasing effective supervision density, SPIRE alleviates optimization instability under sparse target distributions. Finally, extensive experiments on various IRSTD benchmarks, including SIRST-UAVB and SIRST4, demonstrate that SPIRE achieves competitive target-level detection performance with a consistently low false alarm rate (Fa) and significantly reduced computational cost. Code is publicly available at: https://github.com/NIRIXIANG/SPIRE-IRSTD.
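To make the idea of turning a single-point annotation into a probabilistic response map more concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify the PRPS formulation, so an isotropic Gaussian response is assumed here. The function names `point_to_response_map` and `decode_centroids`, the `sigma` value, and the `threshold` value are all hypothetical choices made for illustration.

```python
import math


def point_to_response_map(height, width, cx, cy, sigma=1.5):
    """Encode one point annotation (cx, cy) as a dense response map.

    ASSUMPTION: an isotropic Gaussian stands in for the point-target
    response prior; the real PRPS formulation may differ.
    """
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def decode_centroids(response, threshold=0.5):
    """Return (x, y, score) for every strict 8-neighbour local maximum
    whose score is at least `threshold` (a hypothetical decoding rule)."""
    h, w = len(response), len(response[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = response[y][x]
            if v < threshold:
                continue
            # Collect the values of the up-to-8 surrounding pixels.
            neighbours = [
                response[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2))
                if (i, j) != (x, y)
            ]
            if all(v > n for n in neighbours):
                peaks.append((x, y, v))
    return peaks


# One annotated target at column 5, row 9 of a 16x16 image.
target_map = point_to_response_map(16, 16, cx=5, cy=9)
peaks = decode_centroids(target_map)
```

Under these assumptions, one labelled pixel becomes a smooth map with many non-zero values around the target. This is one plausible reading of "increasing effective supervision density". The centroid is then read back as the peak of the map, with no decoder or mask involved.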