AI Summary
In camouflaged object detection (COD), segmentation accuracy suffers from high object-background similarity and ambiguous boundaries. To address this, the authors propose a dual-decoder collaborative architecture: an Enrich Decoder uses channel-wise attention to strengthen discriminative feature representation, while a Retouch Decoder uses spatial attention to recover boundary regions precisely. The two decoders complement each other through multi-stage feature interaction, overcoming the limitations of single-path decoding. The method has a lightweight, plug-and-play design compatible with diverse backbone encoders. Evaluated on the four standard COD benchmarks, it consistently surpasses state-of-the-art methods, with reported improvements in boundary clarity (+4.2% Fβ) and small-object recall (+6.8% Eφ), demonstrating strong generalization capability.
Abstract
Camouflaged object detection (COD) aims to generate a fine-grained segmentation map of camouflaged objects hidden in their background. Because these objects blend into their surroundings, the decoder must be tailored to extract features that distinguish them and to reconstruct their complex boundaries with particular care. In this paper, we propose a novel architecture that augments the prevalent decoding strategy in COD with an Enrich Decoder and a Retouch Decoder, which together help generate a fine-grained segmentation map. Specifically, the Enrich Decoder uses channel-wise attention to amplify the feature channels that are important for COD. The Retouch Decoder then refines the segmentation maps by spatially attending to important pixels, such as those in boundary regions. Through extensive experiments, we demonstrate that the proposed architecture, ENTO, achieves superior performance with various encoders, and that the two novel components play distinct, mutually complementary roles.
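To make the two attention mechanisms concrete, here is a minimal pure-Python sketch of generic channel-wise and spatial attention on a toy feature map. It is not ENTO's implementation: the function names `channel_attention` and `spatial_attention` are hypothetical, and the gates are assumed to come directly from average pooling followed by a sigmoid, whereas a trained module would typically pass the pooled statistics through learned layers first.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Rescale each channel by one gate derived from its global average.

    feat: list of C channels, each an H x W list of lists of floats.
    Assumption: the gate is sigmoid(mean); no learned weights are used.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Rescale each pixel by one gate derived from its mean across channels.

    Assumption: the gate is sigmoid(channel mean); no learned weights are used.
    """
    n_channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = [
        [
            sigmoid(sum(feat[c][i][j] for c in range(n_channels)) / n_channels)
            for j in range(width)
        ]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * gates[i][j] for j in range(width)] for i in range(height)]
        for c in range(n_channels)
    ]


# Toy feature map: 2 channels, 2 x 2 pixels (illustrative values only).
feat = [
    [[2.0, 2.0], [2.0, 2.0]],  # uniformly activated channel
    [[0.0, 0.0], [0.0, 4.0]],  # channel with a single salient pixel
]
enriched = channel_attention(feat)        # one gate per channel
retouched = spatial_attention(enriched)   # one gate per pixel
```

Channel-wise attention applies a single scalar to every pixel of a channel, so it selects which feature maps matter. Spatial attention applies a single scalar to every channel at a pixel, so it selects where in the image to focus, for example along object boundaries.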