Scalpel-SAM: A Semi-Supervised Paradigm for Adapting SAM to Infrared Small Object Detection

📅 2025-12-27
🤖 AI Summary
To address the high annotation cost and scarcity of labeled data in infrared small target detection (IRSTD), this paper proposes the first SAM-based semi-supervised detection framework. To bridge the semantic and physical domain gap between SAM and infrared imagery, it designs a white-box hierarchical Mixture-of-Experts (MoE) adapter for explicit multi-granularity feature alignment, and introduces a physics-guided knowledge distillation mechanism that incorporates infrared imaging priors to constrain pseudo-label generation and enable two-stage knowledge transfer. Using only 10% of the fully annotated data, the method matches or surpasses fully supervised baselines, improving detection accuracy and cross-dataset generalization across multiple IRSTD benchmarks. The framework offers an interpretable, lightweight, and deployable paradigm for low-resource infrared perception.

📝 Abstract
Infrared small object detection urgently requires semi-supervised paradigms because annotation is costly. However, existing methods such as SAM face significant challenges: domain gaps, an inability to encode physical priors, and inherent architectural complexity. To address this, we design a Hierarchical MoE Adapter consisting of four white-box neural operators. Building upon this core component, we propose a two-stage paradigm for knowledge distillation and transfer: (1) Prior-Guided Knowledge Distillation, where we use our MoE adapter and 10% of the available fully supervised data to distill SAM into an expert teacher (Scalpel-SAM); and (2) Deployment-Oriented Knowledge Transfer, where we use Scalpel-SAM to generate pseudo labels for training lightweight and efficient downstream models. Experiments demonstrate that, with minimal annotations, our paradigm enables downstream models to achieve performance comparable to, or even surpassing, their fully supervised counterparts. To our knowledge, this is the first semi-supervised paradigm that systematically addresses data scarcity in infrared small object detection using SAM as the teacher model.
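The two-stage flow described in the abstract can be illustrated with a minimal sketch. It covers only the data handling: picking the 10% labeled subset and turning teacher predictions into pseudo labels. Everything named here is an assumption made for illustration: `split_labeled_unlabeled`, `generate_pseudo_labels`, the 0.5 threshold, and `toy_teacher` (a stand-in that only normalizes intensity) are hypothetical and are not the paper's implementation of Scalpel-SAM or of its physics-guided constraints.

```python
import numpy as np

def split_labeled_unlabeled(num_images, labeled_fraction=0.1, seed=0):
    """Randomly pick the small labeled subset; the rest is treated as unlabeled."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(num_images)
    num_labeled = max(1, int(round(labeled_fraction * num_images)))
    return order[:num_labeled], order[num_labeled:]

def generate_pseudo_labels(teacher, images, threshold=0.5):
    """Run the teacher on unlabeled images and binarize its probability maps.

    `teacher` is a hypothetical callable mapping an HxW image to an HxW map
    of values in [0, 1]; the threshold is an assumed hyperparameter.
    """
    masks = []
    for image in images:
        prob = teacher(image)
        masks.append((prob >= threshold).astype(np.uint8))
    return masks

def toy_teacher(image):
    """Stand-in for the distilled teacher: min-max normalizes intensity."""
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo + 1e-8)

# Stage 1 would fine-tune the teacher on the labeled indices (not shown).
labeled_idx, unlabeled_idx = split_labeled_unlabeled(100)

# Stage 2: the teacher labels the remaining images for a lightweight student.
unlabeled_images = [np.random.default_rng(i).random((32, 32)) for i in range(3)]
pseudo_masks = generate_pseudo_labels(toy_teacher, unlabeled_images)
```

In a real pipeline the pseudo masks would then serve as training targets for the downstream detector, which is the part the abstract calls Deployment-Oriented Knowledge Transfer.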

Problem

Research questions and friction points this paper is trying to address.

Adapts SAM to infrared small object detection with minimal annotations
Addresses domain gaps and architectural complexity in existing methods
Enables efficient downstream models via knowledge distillation and transfer

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical MoE Adapter with white-box neural operators
Two-stage knowledge distillation and transfer paradigm
Generates pseudo labels for efficient downstream models
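To make the adapter idea in the list above concrete, here is a minimal sketch of a mixture-of-experts block with four fixed, inspectable operators. The page does not say which four white-box operators the paper uses, nor how its hierarchy or gating works, so everything below is an assumption: the four experts (identity, local mean, local contrast, gradient magnitude) are hypothetical placeholders, the gate is a plain softmax over supplied logits, and the mixture is a single flat level with a residual connection.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def box_mean(x, k=3):
    """k x k local mean with edge padding."""
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

# Hypothetical placeholder experts; NOT the paper's four operators.
def expert_identity(x):
    return x

def expert_smooth(x):
    return box_mean(x)

def expert_local_contrast(x):
    return x - box_mean(x)

def expert_gradient(x):
    gy, gx = np.gradient(x)
    return np.hypot(gx, gy)

EXPERTS = (expert_identity, expert_smooth, expert_local_contrast, expert_gradient)

def moe_adapter(feature, gate_logits):
    """Mix expert outputs with softmax gate weights and add a residual."""
    weights = softmax(np.asarray(gate_logits, dtype=float))
    outputs = np.stack([expert(feature) for expert in EXPERTS])
    mixed = np.tensordot(weights, outputs, axes=1)  # weighted sum over experts
    return feature + mixed, weights

feature = np.zeros((8, 8))
feature[4, 4] = 1.0  # a single bright pixel standing in for a small target
adapted, weights = moe_adapter(feature, gate_logits=[0.0, 0.0, 0.0, 0.0])
```

Because each expert is a fixed, named operator, the returned gate weights show directly how much each one contributes, which is one plausible reading of "white-box" in this context.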