IrisNet: Infrared Image Status Awareness Meta Decoder for Infrared Small Targets Detection

📅 2025-11-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
Infrared small target detection (IRSTD) suffers from low signal-to-noise ratio, complex heterogeneous backgrounds, and intrinsically weak target features. Conventional encoder-decoder architectures use static decoder parameters, which limits their ability to adapt to distribution shifts across scenarios (e.g., day/night conditions or sky/maritime/terrestrial backgrounds) and reduces robustness. To address this, the paper proposes IrisNet, an image-status-aware meta-decoding framework in which an image-to-decoder Transformer maps features of the input infrared image to the parameters of the entire decoder. The decoder parameters are represented as a structured 2D tensor that preserves hierarchical layer correlations; self-attention models the dependencies between decoder layers, while cross-attention with the image features produces scene-adaptive decoding patterns. High-frequency components are additionally integrated to supply target-position and scene-edge information. The method reports state-of-the-art performance on NUDT-SIRST, NUAA-SIRST, and IRSTD-1K.

📝 Abstract
Infrared Small Target Detection (IRSTD) faces significant challenges due to low signal-to-noise ratios, complex backgrounds, and the absence of discernible target features. While deep learning-based encoder-decoder frameworks have advanced the field, their static pattern learning suffers from pattern drift across diverse scenarios (e.g., day/night variations, sky/maritime/ground domains), limiting robustness. To address this, we propose IrisNet, a novel meta-learned framework that dynamically adapts detection strategies to the input infrared image status. Our approach establishes a dynamic mapping between infrared image features and the entire set of decoder parameters via an image-to-decoder transformer. More concretely, we represent the parameterized decoder as a structured 2D tensor preserving hierarchical layer correlations and enable the transformer to model inter-layer dependencies through self-attention while generating adaptive decoding patterns via cross-attention. To further enhance the perception ability of infrared images, we integrate high-frequency components to supplement target-position and scene-edge information. Experiments on NUDT-SIRST, NUAA-SIRST, and IRSTD-1K datasets demonstrate the superiority of our IrisNet, achieving state-of-the-art performance.
Problem

Research questions and friction points this paper is trying to address.

Dynamic adaptation to varying infrared scenarios for robust detection
Overcoming static pattern limitations in encoder-decoder IRSTD frameworks
Enhancing target perception through high-frequency component integration
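
The high-frequency idea in the last item can be illustrated with a minimal sketch. The summary does not say how IrisNet extracts high-frequency components, so the approach here (subtracting a 3x3 box blur from the frame) is an assumption chosen for simplicity, and the names `box_blur` and `high_frequency` are hypothetical. It shows only why a high-pass residual helps: a small bright target stands out while the flat background cancels.

```python
def box_blur(img):
    """3x3 mean filter with edge replication; img is a list of equal-length rows."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image border
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def high_frequency(img):
    """High-pass residual: original minus its low-pass (blurred) version.

    Assumption: a stand-in for whatever high-frequency extraction the paper uses.
    """
    low = box_blur(img)
    return [[img[y][x] - low[y][x] for x in range(len(img[0]))]
            for y in range(len(img))]


# Toy 5x5 infrared frame: flat background with a single-pixel target at the center.
frame = [[10.0] * 5 for _ in range(5)]
frame[2][2] = 100.0
hf = high_frequency(frame)
```

In this toy frame the residual is zero wherever the 3x3 window sees only background, and largest at the target pixel, which is the kind of position cue the abstract describes.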
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic decoder parameter adaptation via transformer
Structured 2D tensor preserving hierarchical layer correlations
High-frequency components integration for enhanced perception
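
The first two items can be sketched as a toy hypernetwork. This is not the authors' implementation: the token sizes, the residual connections, the absence of learned query/key/value projections, and the names `attend` and `generate_decoder_params` are all assumptions made to keep the example short. It shows one token per decoder layer, self-attention across those tokens (inter-layer dependencies), cross-attention to image tokens (scene adaptation), and a projection that yields one row of a 2D parameter tensor per layer.

```python
import math
import random


def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def attend(queries, keys, values):
    """Scaled dot-product attention on lists of vectors (no learned projections)."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out


def add(a, b):
    """Element-wise sum of two equally shaped lists of vectors (residual connection)."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def generate_decoder_params(image_tokens, layer_tokens, proj):
    """Return a 2D tensor: one row of parameters per decoder layer."""
    # Self-attention: each layer token sees the other layer tokens.
    h = add(layer_tokens, attend(layer_tokens, layer_tokens, layer_tokens))
    # Cross-attention: layer tokens query the image tokens.
    h = add(h, attend(h, image_tokens, image_tokens))
    # Linear projection of each token (size D) to P parameters.
    return [[sum(t[i] * proj[i][j] for i in range(len(t)))
             for j in range(len(proj[0]))] for t in h]


rng = random.Random(0)
D, L, P, N = 4, 3, 6, 5  # token dim, decoder layers, params per layer, image tokens
layer_tokens = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(L)]
proj = [[rng.gauss(0, 0.1) for _ in range(P)] for _ in range(D)]
day = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]    # stand-in image features
night = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(N)]  # a different scene

params_day = generate_decoder_params(day, layer_tokens, proj)
params_night = generate_decoder_params(night, layer_tokens, proj)
```

Both calls share the same layer tokens and projection, so any difference between `params_day` and `params_night` comes only from the image tokens; that dependence on the input is what distinguishes this design from a decoder with static parameters.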
Xuelin Qian
Northwestern Polytechnical University
computer vision, machine learning, multimedia
Jiaming Lu
Northwestern Polytechnical University
Zixuan Wang
Northwestern Polytechnical University
Wenxuan Wang
Northwestern Polytechnical University
Zhongling Huang
Northwestern Polytechnical University
Dingwen Zhang
Northwestern Polytechnical University
Junwei Han
Northwestern Polytechnical University