IrisNet: Infrared Image Status Awareness Meta Decoder for Infrared Small Targets Detection

📅 2025-11-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Infrared small target detection (IRSTD) suffers from low signal-to-noise ratio, complex heterogeneous backgrounds, and intrinsically weak target features. Conventional encoder-decoder architectures use static decoder parameters, which limits their adaptability to cross-scenario distribution shifts (e.g., day/night conditions or sky/maritime/terrestrial backgrounds) and thus impairs robustness. To address this, we propose IrisNet, an image-state-aware meta-decoding framework in which a Transformer maps features of the input image to the decoder's parameters, so that decoding adapts to each image. The decoder parameters are represented as a structured 2D tensor that preserves hierarchical layer correlations; self-attention models inter-layer dependencies, while cross-attention generates scene-adaptive decoding patterns. High-frequency information is explicitly injected to supplement target-position and scene-edge cues. Our method achieves state-of-the-art performance on NUDT-SIRST, NUAA-SIRST, and IRSTD-1K.

📝 Abstract

Infrared Small Target Detection (IRSTD) faces significant challenges due to low signal-to-noise ratios, complex backgrounds, and the absence of discernible target features. While deep learning-based encoder-decoder frameworks have advanced the field, their static pattern learning suffers from pattern drift across diverse scenarios (e.g., day/night variations, sky/maritime/ground domains), limiting robustness. To address this, we propose IrisNet, a novel meta-learned framework that dynamically adapts detection strategies to the input infrared image status. Our approach establishes a dynamic mapping between infrared image features and entire decoder parameters via an image-to-decoder transformer. More concretely, we represent the parameterized decoder as a structured 2D tensor preserving hierarchical layer correlations and enable the transformer to model inter-layer dependencies through self-attention while generating adaptive decoding patterns via cross-attention. To further enhance the perception ability of infrared images, we integrate high-frequency components to supplement target-position and scene-edge information. Experiments on NUDT-SIRST, NUAA-SIRST, and IRSTD-1K datasets demonstrate the superiority of our IrisNet, achieving state-of-the-art performance.
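
The image-to-decoder mapping described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`attention`, `generate_decoder_params`), the token shapes, the residual connections, and the omission of learned projection matrices are all assumptions made for illustration. Each row of `param_tokens` stands for one decoder layer, so the array as a whole plays the role of the structured 2D tensor.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Single-head scaled dot-product attention.
    # Assumption: no learned Q/K/V projections, to keep the sketch short.
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    return softmax(scores, axis=-1) @ values

def generate_decoder_params(param_tokens, image_tokens):
    """Hypothetical image-to-decoder step.

    param_tokens: (num_layers, dim), one row per decoder layer.
    image_tokens: (num_patches, dim), features of the input infrared image.
    Returns an array of shape (num_layers, dim) conditioned on the image.
    """
    # Self-attention: decoder-layer tokens exchange information,
    # modelling inter-layer dependencies.
    h = param_tokens + attention(param_tokens, param_tokens, param_tokens)
    # Cross-attention: layer tokens query the image features,
    # making the result depend on the input image.
    h = h + attention(h, image_tokens, image_tokens)
    return h

rng = np.random.default_rng(0)
num_layers, num_patches, dim = 4, 16, 8   # illustrative sizes only
param_tokens = rng.standard_normal((num_layers, dim))
image_a = rng.standard_normal((num_patches, dim))
image_b = rng.standard_normal((num_patches, dim))

params_a = generate_decoder_params(param_tokens, image_a)
params_b = generate_decoder_params(param_tokens, image_b)
```

Because the same `param_tokens` yield different outputs for `image_a` and `image_b`, the decoder parameters depend on the input image rather than being fixed after training. In a full model, each output row would presumably be projected and reshaped into the weights of the corresponding decoder layer; that step is omitted here.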

Problem

Research questions and friction points this paper is trying to address.

Dynamic adaptation to varying infrared scenarios for robust detection

Overcoming static pattern limitations in encoder-decoder IRSTD frameworks

Enhancing target perception through high-frequency component integration

Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic decoder parameter adaptation via an image-to-decoder transformer

Structured 2D tensor preserving hierarchical layer correlations

High-frequency component integration for enhanced perception
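
How the high-frequency components are extracted is not specified in this summary. One common option, shown here purely as an assumption, is to subtract a low-pass (box-blurred) copy of the image from the original, which leaves edges and small bright spots. The helper names `box_blur` and `high_frequency` and the 3x3 kernel size are hypothetical.

```python
import numpy as np

def box_blur(image, k=3):
    # Mean filter with a k x k window; borders are handled by edge replication.
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros_like(image, dtype=float)
    height, width = image.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + height, dx:dx + width]
    return out / (k * k)

def high_frequency(image, k=3):
    # High-pass residual: original minus its low-pass (blurred) version.
    return image - box_blur(image, k)

# Toy infrared frame: flat background with a single bright one-pixel target.
image = np.full((9, 9), 10.0)
image[4, 4] = 50.0

hf = high_frequency(image)
target_row, target_col = np.unravel_index(np.argmax(hf), hf.shape)
```

On this toy frame the flat background cancels out and the strongest response sits at the target pixel, which is the kind of position cue the abstract says high-frequency information is meant to supply.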

🔎 Similar Papers

Infrared Small Target Detection based on Adjustable Sensitivity Strategy and Multi-Scale Fusion

2024-07-29 · arXiv.org · Citations: 4