AI Summary
This work addresses camouflaged object segmentation, where targets are difficult to discern because their color, texture, and structure closely match the background. To this end, we propose a language-guided structure-aware network that, for the first time, incorporates textual prompts into camouflaged object detection. Leveraging semantic priors derived from CLIP, our method directs multi-scale visual features toward potential target regions. We further introduce a Fourier edge enhancement module, a structure-aware attention mechanism, and a coarse-to-fine local refinement module to strengthen the model's perception of object structures and boundaries. Built on the PVT-v2 backbone and combining frequency-domain high-pass filtering with multi-scale feature fusion, the proposed approach achieves state-of-the-art performance across multiple COD benchmarks, significantly improving both segmentation accuracy and boundary completeness.
Abstract
Camouflaged Object Detection (COD) aims to segment objects that are highly integrated with the background in color, texture, and structure, making it a highly challenging task in computer vision. Although existing methods introduce multi-scale fusion and attention mechanisms to alleviate this issue, they generally lack the guidance of textual semantic priors, which limits their ability to focus on camouflaged regions in complex scenes. To address this, we propose a Language-Guided Structure-Aware Network (LGSAN). Specifically, on top of the PVT-v2 visual backbone, we introduce CLIP to generate masks from text prompts and RGB images, thereby guiding the multi-scale features extracted by PVT-v2 toward potential target regions. On this foundation, we design a Fourier Edge Enhancement Module (FEEM) that fuses multi-scale features with high-frequency information in the frequency domain to extract edge-enhanced features. We further propose a Structure-Aware Attention Module (SAAM) to strengthen the model's perception of object structures and boundaries. Finally, we introduce a Coarse-Guided Local Refinement Module (CGLRM) to improve fine-grained reconstruction and boundary integrity of camouflaged object regions. Extensive experiments demonstrate that our method consistently achieves highly competitive performance across multiple COD datasets, validating its effectiveness and robustness.
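The abstract does not give FEEM's exact formulation, so the sketch below only illustrates the general idea it names: extracting edge-like, high-frequency content from a feature map via frequency-domain high-pass filtering. The function name, the circular mask, and the `cutoff_ratio` parameter are illustrative assumptions, not the paper's design.

```python
import numpy as np

def highpass_edge_features(feat: np.ndarray, cutoff_ratio: float = 0.1) -> np.ndarray:
    """Illustrative high-pass filter: FFT a 2-D feature map, zero out a
    centered low-frequency disk, and invert, keeping edge-like content.
    Not the paper's FEEM; a generic frequency-domain sketch."""
    h, w = feat.shape
    spectrum = np.fft.fftshift(np.fft.fft2(feat))  # move DC to the center
    # Build a circular low-frequency mask around the spectrum center.
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    radius = cutoff_ratio * min(h, w)
    lowpass = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    spectrum[lowpass] = 0.0  # suppress low frequencies (high-pass)
    return np.fft.ifft2(np.fft.ifftshift(spectrum)).real

# A constant map has only DC energy, so its high-pass response is ~0;
# a sharp vertical step keeps a strong response near the discontinuity.
flat = np.ones((32, 32), dtype=np.float32)
step = np.zeros((32, 32), dtype=np.float32)
step[:, 16:] = 1.0
flat_resp = np.abs(highpass_edge_features(flat)).max()
step_resp = np.abs(highpass_edge_features(step)).max()
```

In a network such a filter would run per channel on backbone feature maps, with the filtered result fused back into the multi-scale features rather than used alone.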