FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object Detection

📅 2025-03-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Camouflaged object detection (COD) is ambiguous because objects closely resemble their backgrounds. Existing approaches are either limited by local receptive fields or incur high computational cost when they adopt Transformer-based architectures. To address these limitations, the paper proposes FMNet, a frequency-assisted Mamba-like linear attention network. Its core component is the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, which uses the discrete cosine transform (DCT) to model a global frequency prior and integrates frequency and spatial features through a multi-scale structure. Two further modules, Pyramidal Frequency Attention Extraction (PFAE) and the Frequency Reverse Decoder (FRD), enhance semantics and reconstruct features. Experiments reported in the paper show that FMNet outperforms existing methods on multiple COD datasets in both accuracy and efficiency. The source code is publicly available.

📝 Abstract

Camouflaged Object Detection (COD) is challenging due to the strong similarity between camouflaged objects and their surroundings, which complicates identification. Existing methods mainly rely on spatial local features and fail to capture global information, while Transformers increase computational costs. To address this, the Frequency-Assisted Mamba-Like Linear Attention Network (FMNet) is proposed, which leverages frequency-domain learning to efficiently capture global features and mitigate ambiguity between objects and the background. FMNet introduces the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, which integrates frequency and spatial features through a multi-scale structure to handle scale variations while reducing computational complexity. Additionally, the Pyramidal Frequency Attention Extraction (PFAE) module and the Frequency Reverse Decoder (FRD) enhance semantics and reconstruct features. Experimental results demonstrate that FMNet outperforms existing methods on multiple COD datasets, showing advantages in both performance and efficiency. Code is available at https://anonymous.4open.science/r/FMNet-3CE5.
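To make the efficiency argument concrete, here is a minimal NumPy sketch of generic kernelized linear attention. It is not the MFM module and is not taken from the FMNet code; the feature map (`elu(x) + 1`), the function names, and the tensor sizes are all assumptions chosen for illustration. The point it shows is that summing over keys once gives the same result as the row-normalized quadratic form, without ever building the n × n attention matrix.

```python
import numpy as np


def feature_map(x):
    # elu(x) + 1, a positive feature map (an assumed choice, not from the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def linear_attention(q, k, v):
    """Kernelized attention with cost linear in the number of tokens n."""
    q_f, k_f = feature_map(q), feature_map(k)
    kv = k_f.T @ v                 # (d, d_v): keys and values summarized once
    k_sum = k_f.sum(axis=0)        # (d,): normalizer shared by all queries
    norm = q_f @ k_sum             # (n,)
    return (q_f @ kv) / norm[:, None]


def quadratic_reference(q, k, v):
    """Same computation written with the explicit n x n weight matrix."""
    q_f, k_f = feature_map(q), feature_map(k)
    weights = q_f @ k_f.T
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
q = rng.standard_normal((64, 8))   # 64 tokens, 8 query/key channels
k = rng.standard_normal((64, 8))
v = rng.standard_normal((64, 4))   # 4 value channels

out = linear_attention(q, k, v)
```

The linear form costs on the order of n·d·d_v operations instead of n²·d, which is the general reason linear-attention designs scale better than standard Transformer attention on dense feature maps.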

Problem

Research questions and friction points this paper is trying to address.

Detecting camouflaged objects that closely resemble their surroundings.

Capturing global features efficiently, which spatial local features alone fail to do.

Reducing computational complexity relative to Transformers while handling scale variations.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-domain learning for global feature capture

Multi-Scale Frequency-Assisted Mamba-Like Linear Attention module

Pyramidal Frequency Attention Extraction and Frequency Reverse Decoder
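Frequency-domain learning of the kind listed above can be illustrated with a 2D discrete cosine transform (DCT). The sketch below is an assumption-laden illustration, not the authors' PFAE or MFM implementation: the helper names, the 16 × 16 feature size, and the choice to keep a 4 × 4 block of low-frequency coefficients are all hypothetical. It shows why a frequency representation is global: every DCT coefficient depends on every spatial position.

```python
import numpy as np


def dct_matrix(n):
    """Orthonormal DCT-II basis as an (n, n) matrix."""
    k = np.arange(n)[:, None]      # frequency index
    i = np.arange(n)[None, :]      # spatial index
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] /= np.sqrt(2.0)      # rescale the DC row so the basis is orthonormal
    return mat


def dct2(x):
    """2D DCT of a single-channel map, applied along rows then columns."""
    h, w = x.shape
    return dct_matrix(h) @ x @ dct_matrix(w).T


def idct2(coeffs):
    """Inverse of dct2 (the transpose, since the basis is orthonormal)."""
    h, w = coeffs.shape
    return dct_matrix(h).T @ coeffs @ dct_matrix(w)


def low_frequency_view(x, keep):
    """Keep only the keep x keep lowest-frequency coefficients (hypothetical helper)."""
    coeffs = dct2(x)
    mask = np.zeros_like(coeffs)
    mask[:keep, :keep] = 1.0
    return idct2(coeffs * mask)


rng = np.random.default_rng(0)
feature = rng.standard_normal((16, 16))   # stand-in for one feature channel
smooth = low_frequency_view(feature, keep=4)
```

A learned model would typically reweight the coefficients rather than hard-mask them; the fixed mask here only keeps the example short.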

🔎 Similar Papers

Shifting Spotlight for Co-supervision: A Simple yet Efficient Single-branch Network to See Through Camouflage