SCOUT: Semi-supervised Camouflaged Object Detection by Utilizing Text and Adaptive Data Selection

📅 2025-08-25

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To reduce the cost of pixel-level annotation in camouflaged object detection (COD), this paper proposes SCOUT, a semi-supervised framework that makes better use of unlabeled data. SCOUT has two components: an Adaptive Data Augment and Selection (ADAS) module, which selects valuable data for annotation through adversarial augmentation and a sampling strategy, and a Text Fusion Module (TFM), which exploits the selected data by combining camouflage-related knowledge with text-visual interaction. The authors also build a new dataset, RefTextCOD, to support this setting. Experiments reported in the paper show that SCOUT surpasses previous semi-supervised COD methods and achieves state-of-the-art performance. SCOUT thus combines vision-language modeling with semi-supervised learning for COD.

📝 Abstract

The difficulty of pixel-level annotation has significantly hindered the development of the Camouflaged Object Detection (COD) field. To save on annotation costs, previous works leverage a semi-supervised COD framework that relies on a small amount of labeled data and a large volume of unlabeled data. We argue that there is still significant room for improvement in the effective utilization of unlabeled data. To this end, we introduce Semi-supervised Camouflaged Object Detection by Utilizing Text and Adaptive Data Selection (SCOUT). It includes an Adaptive Data Augment and Selection (ADAS) module and a Text Fusion Module (TFM). The ADAS module selects valuable data for annotation through an adversarial augment and sampling strategy. The TFM module further leverages the selected valuable data by combining camouflage-related knowledge and text-visual interaction. To adapt to this work, we build a new dataset, namely RefTextCOD. Extensive experiments show that the proposed method surpasses previous semi-supervised methods in the COD field and achieves state-of-the-art performance. Our code will be released at https://github.com/Heartfirey/SCOUT.
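The abstract says only that ADAS selects valuable data through an adversarial augment and sampling strategy; it does not give the scoring rule. As a rough illustration of the general idea, the sketch below assumes that an image is "valuable" when the model's prediction changes a lot under augmentation, and picks the highest-scoring images for annotation. The names `disagreement`, `select_for_annotation`, `predict`, `augment`, and `budget` are hypothetical and are not taken from the paper or its code.

```python
from typing import Callable, List, Sequence

Mask = List[float]  # flattened per-pixel foreground probabilities


def disagreement(p: Sequence[float], q: Sequence[float]) -> float:
    """Mean absolute difference between two probability maps."""
    if len(p) != len(q) or len(p) == 0:
        raise ValueError("maps must be non-empty and the same length")
    return sum(abs(a - b) for a, b in zip(p, q)) / len(p)


def select_for_annotation(
    images: Sequence[Mask],
    predict: Callable[[Mask], Mask],
    augment: Callable[[Mask], Mask],
    budget: int,
) -> List[int]:
    """Return indices of the `budget` images whose predictions are least
    stable under augmentation.

    ASSUMPTION: prediction instability is used here as a stand-in for
    "valuable"; the paper's actual selection criterion may differ.
    """
    scored = []
    for idx, image in enumerate(images):
        score = disagreement(predict(image), predict(augment(image)))
        scored.append((score, idx))
    # Highest disagreement first; ties broken by original index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:budget]]
```

In a real pipeline, `predict` would be the segmentation network and `augment` a learned or adversarial perturbation; here both are plain callables so the selection logic can be read in isolation.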

Problem

Research questions and friction points this paper is trying to address.

Reducing annotation costs for camouflaged object detection

Improving utilization of unlabeled data in COD

Integrating text and visual information for detection

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Data Augment and Selection module

Text Fusion Module with text-visual interaction

Semi-supervised framework using camouflage-related text
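The listing describes the Text Fusion Module only as "text-visual interaction", without architectural detail. One common way to let visual features attend to text is scaled dot-product cross-attention, shown below as a minimal sketch. This is an assumption for illustration, not the paper's TFM: it uses a single head, no learned projections, and a residual add, and the name `fuse_text_into_visual` is hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_text_into_visual(visual: List[Vector], text: List[Vector]) -> List[Vector]:
    """Cross-attention with visual tokens as queries and text tokens as
    keys and values, followed by a residual add.

    ASSUMPTION: both modalities already share the same embedding size.
    """
    dim = len(text[0])
    scale = math.sqrt(dim)
    fused = []
    for v in visual:
        # Similarity of this visual token to every text token.
        scores = [sum(a * b for a, b in zip(v, t)) / scale for t in text]
        weights = softmax(scores)
        # Attention-weighted average of the text tokens.
        context = [sum(w * t[d] for w, t in zip(weights, text)) for d in range(dim)]
        # Residual connection keeps the original visual information.
        fused.append([a + c for a, c in zip(v, context)])
    return fused
```

Each output token is the original visual token plus a text summary weighted by similarity, so regions that match the description receive more text signal than regions that do not.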

🔎 Similar Papers

Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage