Weakly Supervised Camouflaged Object Detection Based on the SAM Model and Mask Guidance

📅 2026-05-24

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the ambiguous boundaries and missed detections that arise in weakly supervised camouflaged object detection when training relies on coarse bounding-box annotations. The authors propose MGNet, a framework that pairs the Segment Anything Model (SAM) with weakly supervised learning: a BoxSAM strategy converts bounding-box annotations into high-quality pixel-level pseudo-labels. The architecture also includes a Cascaded Mask Decoder (CMD), a Context Enhancement Module (CEM), and a Mask-guided Feature Aggregation Module (MFAM), which together enable mask-guided fine-grained segmentation. According to the reported experiments, MGNet outperforms existing weakly supervised methods across multiple benchmarks and performs on par with current state-of-the-art approaches.

📝 Abstract

Camouflaged object detection (COD) from a single image is a challenging task due to the high similarity between objects and their surroundings. Existing fully supervised methods require labor-intensive pixel-level annotations, making weakly supervised methods a viable compromise that balances accuracy and annotation efficiency. However, weakly supervised methods often suffer performance degradation because they rely on coarse annotations. In this paper, we introduce a new weakly supervised approach for camouflaged object detection to overcome these limitations. Specifically, we propose a novel network, MGNet, which tackles edge ambiguity and missed detections by using initial masks generated by our custom-designed Cascaded Mask Decoder (CMD) to guide the segmentation process and enhance edge predictions. We introduce a Context Enhancement Module (CEM) to reduce missed detections, and a Mask-guided Feature Aggregation Module (MFAM) for effective feature aggregation. For the weak supervision challenge, we propose BoxSAM, which leverages the Segment Anything Model (SAM) with bounding-box prompts to generate pseudo-labels. By employing a redundant processing strategy, BoxSAM provides high-quality pixel-level pseudo-labels for training MGNet. Extensive experiments demonstrate that our method delivers competitive performance against current state-of-the-art methods.
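The idea of turning a bounding-box prompt into a pixel-level pseudo-label can be illustrated with a minimal sketch. This is not the paper's BoxSAM implementation, and it does not reproduce the redundant processing strategy, which the abstract does not specify. The function `box_to_pseudo_label`, the `segmenter` callable, and the two clean-up rules (discard pixels outside the box; fall back to the box when the mask is empty) are all assumptions made for illustration. In practice the `segmenter` could wrap a promptable model such as SAM; here a toy brightness threshold stands in so the example runs on its own.

```python
import numpy as np


def box_to_pseudo_label(image, box, segmenter):
    """Turn one bounding box into a binary pixel-level pseudo-label.

    image:     H x W x 3 array.
    box:       (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
    segmenter: hypothetical callable (image, box) -> H x W boolean mask,
               e.g. a wrapper around a box-prompted segmentation model.
    """
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    # Clamp the box to the image bounds.
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)

    mask = np.asarray(segmenter(image, (x0, y0, x1, y1)), dtype=bool)

    # Assumption: anything outside the annotated box is background.
    inside = np.zeros((h, w), dtype=bool)
    inside[y0:y1, x0:x1] = True
    mask &= inside

    # Assumption: if the segmenter finds nothing, fall back to the box itself.
    if not mask.any():
        mask = inside
    return mask.astype(np.uint8)


def toy_segmenter(image, box):
    # Stand-in for a promptable model: bright pixels count as foreground.
    return image.mean(axis=2) > 128


image = np.zeros((6, 6, 3), dtype=np.uint8)
image[1:4, 1:4] = 255  # bright "object" inside the box
image[5, 5] = 255      # bright distractor outside the box
label = box_to_pseudo_label(image, (0, 0, 5, 5), toy_segmenter)
```

Clipping to the box reflects the only supervision a box annotation actually provides: the object lies somewhere inside it. Whether MGNet's pipeline applies the same rule is not stated in the abstract.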

Problem

Research questions and friction points this paper is trying to address.

Camouflaged Object Detection

Weakly Supervised Learning

Pseudo-labels

Annotation Efficiency

Edge Ambiguity

Innovation

Methods, ideas, or system contributions that make the work stand out.

Weakly Supervised Learning

Camouflaged Object Detection

Segment Anything Model (SAM)