SAMamba: Adaptive State Space Modeling with Hierarchical Vision for Infrared Small Target Detection

📅 2025-05-29

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Infrared small target detection (ISTD) is difficult because targets occupy an extremely small fraction of the image (<0.15%), the signal-to-clutter ratio is low, and backgrounds are complex, which together lead to high false-negative and false-positive rates. To address these issues, the paper proposes SAMamba, an ISTD framework that integrates SAM2’s hierarchical visual representations with Mamba’s selective state-space modeling. Its key components are: (1) an FS-Adapter module that adapts features from the natural-image domain to the infrared domain; (2) a CSI module that models global context efficiently by capturing long-range dependencies; and (3) a DPCF module that fuses multi-scale features while preserving detail, limiting the information loss caused by downsampling. The framework further incorporates learnable task embeddings, channel-adaptive transformations, and gated multi-scale fusion. Experiments on NUAA-SIRST, IRSTD-1k, and NUDT-SIRST show that it outperforms state-of-the-art methods, particularly with heterogeneous backgrounds and targets of varying scale, with gains in both detection accuracy and robustness.

📝 Abstract

Infrared small target detection (ISTD) is vital for long-range surveillance in military, maritime, and early warning applications. ISTD is challenged by targets occupying less than 0.15% of the image and low distinguishability from complex backgrounds. Existing deep learning methods often suffer from information loss during downsampling and inefficient global context modeling. This paper presents SAMamba, a novel framework integrating SAM2's hierarchical feature learning with Mamba's selective sequence modeling. Key innovations include: (1) A Feature Selection Adapter (FS-Adapter) for efficient natural-to-infrared domain adaptation via dual-stage selection (token-level with a learnable task embedding and channel-wise adaptive transformations); (2) A Cross-Channel State-Space Interaction (CSI) module for efficient global context modeling with linear complexity using selective state space modeling; and (3) A Detail-Preserving Contextual Fusion (DPCF) module that adaptively combines multi-scale features with a gating mechanism to balance high-resolution and low-resolution feature contributions. SAMamba addresses core ISTD challenges by bridging the domain gap, maintaining fine-grained details, and efficiently modeling long-range dependencies. Experiments on NUAA-SIRST, IRSTD-1k, and NUDT-SIRST datasets show SAMamba significantly outperforms state-of-the-art methods, especially in challenging scenarios with heterogeneous backgrounds and varying target scales. Code: https://github.com/zhengshuchen/SAMamba.
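
The gating idea behind the DPCF module (balancing high-resolution and low-resolution feature contributions) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the nearest-neighbor upsampling, and the scalar gate parameters `w_high`, `w_low`, and `bias` are assumptions chosen for illustration, standing in for what would be learned layers in a real network.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(fmap, factor):
    """Nearest-neighbor upsampling of a 2D map stored as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def gated_fusion(high, low, w_high=1.0, w_low=1.0, bias=0.0):
    """Fuse a high-resolution map with a coarser one through a per-pixel gate.

    Hypothetical sketch: the gate g lies in (0, 1), so each output value is a
    convex combination of the high-resolution value and the upsampled
    low-resolution value. Assumes len(high) is a multiple of len(low).
    """
    factor = len(high) // len(low)
    low_up = upsample_nearest(low, factor)
    fused = []
    for h_row, l_row in zip(high, low_up):
        fused_row = []
        for h, l in zip(h_row, l_row):
            g = sigmoid(w_high * h + w_low * l + bias)  # gate from both inputs
            fused_row.append(g * h + (1.0 - g) * l)
        fused.append(fused_row)
    return fused


# A single bright pixel in the high-resolution map, plus coarse context.
high = [[0.0, 0.0, 0.0, 0.0],
        [0.0, 4.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
low = [[1.0, 0.0],
       [0.0, 0.0]]
fused = gated_fusion(high, low)
```

With these toy inputs the gate is close to 1 at the bright pixel, so the fine detail from the high-resolution map dominates there, while neighboring pixels blend in the coarse context.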

Problem

Research questions and friction points this paper is trying to address.

Detects infrared small targets in complex backgrounds

Reduces information loss during image downsampling

Improves global context modeling efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Feature Selection Adapter for domain adaptation

Cross-Channel State-Space Interaction module

Detail-Preserving Contextual Fusion module
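
The linear-complexity claim for selective state-space modeling can be made concrete with a toy recurrence. The sketch below is a scalar, single-channel simplification in the general style of Mamba-like selective scans, not the paper's CSI module: the parameter names `w_delta`, `w_b`, `w_c`, and `a` are hypothetical, and a real model would use learned projections over many channels.

```python
import math


def softplus(x):
    return math.log1p(math.exp(x))


def selective_scan(xs, w_delta=1.0, w_b=1.0, w_c=1.0, a=-1.0):
    """Scalar selective state-space recurrence over a 1D sequence.

    h_t = exp(delta_t * a) * h_{t-1} + delta_t * b_t * x_t
    y_t = c_t * h_t

    delta_t, b_t, and c_t depend on the current input x_t, which is what
    makes the scan "selective". One pass over xs gives O(len(xs)) cost.
    """
    h = 0.0
    ys = []
    for x in xs:
        delta = softplus(w_delta * x)  # input-dependent step size, always > 0
        b = w_b * x                    # input-dependent input projection
        c = w_c * x                    # input-dependent output projection
        h = math.exp(delta * a) * h + delta * b * x
        ys.append(c * h)
    return ys


ys = selective_scan([0.0, 0.0, 1.0, 0.0, 0.5])
```

Because `a` is negative, the factor `exp(delta * a)` stays below 1 and the hidden state decays between inputs; the state carries information from earlier positions forward without any pairwise comparison between positions, which is where the linear cost comes from.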

🔎 Similar Papers

Infrared Small Target Detection based on Adjustable Sensitivity Strategy and Multi-Scale Fusion