SCOUT: Semi-supervised Camouflaged Object Detection by Utilizing Text and Adaptive Data Selection

2025-08-25
Citations: 0
Influential: 0
AI Summary
To address the high cost of pixel-level annotation in camouflaged object detection (COD), this paper proposes SCOUT, a semi-supervised framework that makes better use of unlabeled data. SCOUT has two modules: an Adaptive Data Augment and Selection (ADAS) module, which combines adversarial augmentation with a sampling strategy to select valuable data for annotation, and a Text Fusion Module (TFM), which models the interaction between textual descriptions and visual features. The paper also builds RefTextCOD, the first COD benchmark dataset annotated with natural-language text descriptions. Experiments show that SCOUT outperforms existing semi-supervised COD approaches and achieves state-of-the-art performance while reducing reliance on labeled data. By combining vision-language modeling with semi-supervised learning, SCOUT offers a multi-modal approach to semi-supervised COD.

Abstract
The difficulty of pixel-level annotation has significantly hindered the development of the Camouflaged Object Detection (COD) field. To reduce annotation costs, previous works have adopted semi-supervised COD frameworks that rely on a small amount of labeled data and a large volume of unlabeled data. We argue that there is still significant room for improvement in the effective utilization of unlabeled data. To this end, we introduce SCOUT (Semi-supervised Camouflaged Object Detection by Utilizing Text and Adaptive Data Selection). It includes an Adaptive Data Augment and Selection (ADAS) module and a Text Fusion Module (TFM). The ADAS module selects valuable data for annotation through an adversarial augment and sampling strategy. The TFM further leverages the selected valuable data by combining camouflage-related knowledge and text-visual interaction. To support this work, we build a new dataset, namely RefTextCOD. Extensive experiments show that the proposed method surpasses previous semi-supervised methods in the COD field and achieves state-of-the-art performance. Our code will be released at https://github.com/Heartfirey/SCOUT.
Problem

Research questions and friction points this paper is trying to address.

Reducing annotation costs for camouflaged object detection
Improving utilization of unlabeled data in COD
Integrating text and visual information for detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive Data Augment and Selection module
Text Fusion Module with text-visual interaction
Semi-supervised framework using camouflage-related text
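The abstract says only that ADAS selects valuable data for annotation through an adversarial augment and sampling strategy; it gives no further detail. The sketch below is one plausible reading, not the authors' implementation: it assumes a sample is "valuable" when the model's predicted mask changes a lot between the clean image and a perturbed view of it. The function names, the disagreement score, and the top-k rule are all hypothetical.

```python
from typing import List, Sequence


def disagreement(pred_a: Sequence[float], pred_b: Sequence[float]) -> float:
    """Mean absolute difference between two flattened probability masks."""
    if len(pred_a) != len(pred_b) or len(pred_a) == 0:
        raise ValueError("masks must be non-empty and the same length")
    return sum(abs(a - b) for a, b in zip(pred_a, pred_b)) / len(pred_a)


def select_for_annotation(
    preds_clean: Sequence[Sequence[float]],
    preds_augmented: Sequence[Sequence[float]],
    budget: int,
) -> List[int]:
    """Return indices of the `budget` samples with the least stable predictions.

    Assumption (not from the paper): instability under augmentation is used
    as a proxy for how valuable a sample would be to annotate.
    """
    scores = [disagreement(c, a) for c, a in zip(preds_clean, preds_augmented)]
    # Rank sample indices from most to least unstable.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


if __name__ == "__main__":
    # Toy predictions: three images, three "pixels" each.
    clean = [[0.9, 0.1, 0.8], [0.5, 0.5, 0.5], [0.2, 0.9, 0.1]]
    augmented = [[0.9, 0.1, 0.7], [0.1, 0.9, 0.9], [0.3, 0.7, 0.1]]
    print(select_for_annotation(clean, augmented, budget=2))
```

In this toy run the second image has the largest change between views, so it is ranked first. A real pipeline would obtain both sets of masks from the detector and would need the paper's actual augmentation and sampling rules.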
Weiqi Yan
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China.
Lvhai Chen
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China.
Shengchuan Zhang
Xiamen University
computer vision, machine learning
Yan Zhang
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China.
Liujuan Cao
Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China.