IP-SAM: Prompt-Space Conditioning for Prompt-Absent Camouflaged Object Detection

📅 2026-03-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the problem that prompt-based segmentation models such as SAM rely on explicit spatial prompts and therefore cannot be applied directly to fully automatic camouflaged object detection (COD). The authors propose IP-SAM, which approaches adaptation from a prompt-space perspective so that segmentation needs no external prompt. A Self-Prompt Generator (SPG) extracts intrinsic image cues that act as coarse regional anchors; these are passed through SAM2's frozen prompt encoder and combined with a LoRA-finetuned image encoder to form an end-to-end automatic segmentation framework. A Prompt-Space Gating (PSG) mechanism additionally suppresses background false positives. While keeping SAM’s prompt interface intact, IP-SAM reports state-of-the-art performance on four COD benchmarks (e.g., MAE = 0.017 on COD10K) with only 21.26M trainable parameters, and shows strong zero-shot transfer on medical polyp segmentation.
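The MAE figure quoted above is, as the metric is commonly defined for COD, the mean absolute difference between the predicted map and the binary ground-truth mask, averaged over all pixels. The following is a minimal sketch of that definition, assuming both inputs are same-shaped 2D lists with values in [0, 1]; the function name and the toy values are illustrative and do not come from the paper.

```python
def mean_absolute_error(pred, gt):
    """Per-pixel mean absolute error between a prediction map and a mask.

    Both arguments are 2D lists of numbers in [0, 1] with identical shape.
    Lower is better; 0.0 means the prediction matches the mask exactly.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")

    total = 0.0
    count = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1

    if count == 0:
        raise ValueError("inputs must contain at least one pixel")
    return total / count


# Toy 2x2 example (hypothetical values, not benchmark data).
pred = [[0.9, 0.1], [0.2, 0.8]]
gt = [[1, 0], [0, 1]]
mae = mean_absolute_error(pred, gt)
```

Benchmark scores are normally this per-image value averaged over the whole test set.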
📝 Abstract
Prompt-conditioned foundation segmenters have emerged as a dominant paradigm for image segmentation, where explicit spatial prompts (e.g., points, boxes, masks) guide mask decoding. However, many real-world deployments require fully automatic segmentation, creating a structural mismatch: the decoder expects prompts that are unavailable at inference. Existing adaptations typically modify intermediate features, inadvertently bypassing the model's native prompt interface and weakening prompt-conditioned decoding. We propose IP-SAM, which revisits adaptation from a prompt-space perspective through prompt-space conditioning. Specifically, a Self-Prompt Generator (SPG) distills image context into complementary intrinsic prompts that serve as coarse regional anchors. These cues are projected through SAM2's frozen prompt encoder, restoring prompt-guided decoding without external intervention. To suppress background-induced false positives, Prompt-Space Gating (PSG) leverages the intrinsic background prompt as an asymmetric suppressive constraint prior to decoding. Under a deterministic no-external-prompt protocol, IP-SAM achieves state-of-the-art performance across four camouflaged object detection benchmarks (e.g., MAE 0.017 on COD10K) with only 21.26M trainable parameters (optimizing SPG, PSG, and a task-specific mask decoder trained from scratch, alongside image-encoder LoRA while keeping the prompt encoder frozen). Furthermore, the proposed conditioning strategy generalizes beyond COD to medical polyp segmentation, where a model trained solely on Kvasir-SEG exhibits strong zero-shot transfer to both CVC-ClinicDB and ETIS.
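The abstract attributes the small trainable-parameter budget partly to LoRA on the image encoder. As a rough illustration of the general LoRA idea (not the authors' configuration, whose rank, scaling, and target layers are not given here), the sketch below wraps a frozen weight matrix W with a trainable low-rank update, so that the output is W x + (alpha / r) B A x. The class name, the initialization scale, and the toy dimensions are assumptions made for this example.

```python
import random


def matvec(matrix, vector):
    """Multiply a 2D list (rows x cols) by a 1D list of length cols."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear map plus a trainable low-rank update (generic LoRA)."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen: never updated during training
        self.scale = alpha / rank     # standard LoRA scaling factor
        # A starts small and random, B starts at zero, so the adapted layer
        # initially reproduces the frozen layer exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.weight, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_parameters(self):
        # Only A and B are trained: rank * (d_in + d_out) values in total.
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)


# Toy 2x3 frozen weight with a rank-1 adapter (hypothetical sizes).
layer = LoRALinear([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]], rank=1, alpha=2.0)
out = layer.forward([1.0, 2.0, 3.0])
```

The design point this illustrates is that the adapter adds only rank * (d_in + d_out) trainable values per wrapped layer, compared with d_in * d_out for full finetuning, while the frozen weights stay untouched.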
Problem

Research questions and friction points this paper is trying to address.

camouflaged object detection
prompt-free segmentation
automatic segmentation
foundation segmenters
prompt-conditioned decoding
Innovation

Methods, ideas, or system contributions that make the work stand out.

prompt-space conditioning
Self-Prompt Generator
Prompt-Space Gating
prompt-absent segmentation
zero-shot transfer
Huiyao Zhang
University of Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences
Jin Bai
University of Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences
Rui Guo
University of Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences
JianWen Tan
University of Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences
HongFei Wang
University of Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences
Ye Li
Professor, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
Body Area Network, Biomedical Big Data, Wearable Sensor, Health Informatics