🤖 AI Summary
Camouflaged object detection (COD) is highly ambiguous because camouflaged objects closely resemble their backgrounds. Existing approaches are either constrained by limited local receptive fields or incur heavy computational overhead when they adopt Transformer-based architectures. To address these limitations, we propose FMNet, a frequency-assisted Mamba-like linear attention network. Its core component is the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, which uses the discrete cosine transform (DCT) to model global frequency priors and fuses frequency and spatial features in a multi-scale structure. It is complemented by the Pyramidal Frequency Attention Extraction (PFAE) module and the Frequency Reverse Decoder (FRD), which enhance semantics and reconstruct features. Together, these components support multi-scale feature learning while keeping the Mamba-like linear attention lightweight. Experiments on multiple COD benchmarks show that the method achieves state-of-the-art performance, improving both detection accuracy and efficiency over existing methods. The source code is publicly available.
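To illustrate why a DCT can act as a global prior, the sketch below computes a plain 1-D DCT-II and checks how many coefficients change when a single sample is edited. It is a minimal illustration in pure Python, not FMNet's implementation: the function name `dct2` and the toy 8-sample signal are assumptions made for this example.

```python
import math


def dct2(signal):
    """Unnormalised 1-D DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(signal)
    return [
        sum(x * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n, x in enumerate(signal))
        for k in range(n_samples)
    ]


# A flat row of "pixels", and a copy with one local change at the last position.
flat = [1.0] * 8
bumped = flat[:-1] + [2.0]

coeffs_flat = dct2(flat)
coeffs_bumped = dct2(bumped)

# Count how many frequency coefficients moved after the single-sample edit.
changed = sum(abs(a - b) > 1e-9 for a, b in zip(coeffs_flat, coeffs_bumped))
print(f"{changed} of {len(flat)} coefficients changed")
```

Every coefficient sums over all positions, so one local edit shifts the whole spectrum here. This is the sense in which frequency features carry global context without attending over every pair of positions.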
📝 Abstract
Camouflaged Object Detection (COD) is challenging due to the strong similarity between camouflaged objects and their surroundings, which complicates identification. Existing methods mainly rely on local spatial features and fail to capture global information, while Transformers increase computational costs. To address this, the Frequency-Assisted Mamba-Like Linear Attention Network (FMNet) is proposed, which leverages frequency-domain learning to efficiently capture global features and mitigate ambiguity between objects and the background. FMNet introduces the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, which integrates frequency and spatial features through a multi-scale structure to handle scale variations while reducing computational complexity. Additionally, the Pyramidal Frequency Attention Extraction (PFAE) module and the Frequency Reverse Decoder (FRD) enhance semantics and reconstruct features. Experimental results demonstrate that FMNet outperforms existing methods on multiple COD datasets, showcasing its advantages in both performance and efficiency. Code is available at https://anonymous.4open.science/r/FMNet-3CE5.
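The complexity saving attributed to linear attention can be sketched with a generic kernelised formulation. This is an assumption-laden illustration rather than the MFM module: the feature map (ReLU plus one), the function names, and the toy tensors are all hypothetical. The point is that keys and values are summarised once, so cost grows linearly with the number of tokens, while the output matches the pairwise form.

```python
def feature_map(vec):
    # Hypothetical positive feature map (ReLU + 1) so all weights stay positive.
    return [max(x, 0.0) + 1.0 for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def quadratic_attention(queries, keys, values):
    """Reference form: computes all N x N pairwise weights."""
    dim_v = len(values[0])
    outputs = []
    for q in queries:
        fq = feature_map(q)
        weights = [dot(fq, feature_map(k)) for k in keys]
        total = sum(weights)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values)) / total
            for d in range(dim_v)
        ])
    return outputs


def linear_attention(queries, keys, values):
    """Same result, but keys and values are summarised in one pass over tokens."""
    dim_k, dim_v = len(keys[0]), len(values[0])
    kv = [[0.0] * dim_v for _ in range(dim_k)]  # sum_j phi(k_j) v_j^T
    k_sum = [0.0] * dim_k                       # sum_j phi(k_j)
    for k, v in zip(keys, values):
        fk = feature_map(k)
        for i in range(dim_k):
            k_sum[i] += fk[i]
            for d in range(dim_v):
                kv[i][d] += fk[i] * v[d]
    outputs = []
    for q in queries:
        fq = feature_map(q)
        norm = dot(fq, k_sum)
        outputs.append([
            sum(fq[i] * kv[i][d] for i in range(dim_k)) / norm
            for d in range(dim_v)
        ])
    return outputs


# Toy inputs: 3 tokens, key/query dimension 2, value dimension 2.
queries = [[0.2, -1.0], [1.5, 0.3], [-0.4, 0.9]]
keys = [[1.0, 0.0], [-0.5, 2.0], [0.3, 0.3]]
values = [[1.0, 2.0], [0.0, -1.0], [4.0, 0.5]]

slow = quadratic_attention(queries, keys, values)
fast = linear_attention(queries, keys, values)
max_gap = max(abs(a - b) for row_s, row_f in zip(slow, fast)
              for a, b in zip(row_s, row_f))
```

The two functions agree up to floating-point rounding. Only the order of summation differs, which is what removes the quadratic dependence on token count.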