FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object Detection

📅 2025-03-14
🤖 AI Summary
Camouflaged object detection (COD) is ambiguous because camouflaged objects look very similar to their backgrounds. Existing approaches are either constrained by limited local receptive fields or incur high computational overhead when they adopt Transformer-based architectures. To address these limitations, the paper proposes FMNet, a frequency-assisted Mamba-like linear attention network. Its core component is the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, which uses the discrete cosine transform (DCT) to model a global frequency prior and fuses frequency and spatial features across multiple scales. Two further modules, Pyramidal Frequency Attention Extraction (PFAE) and the Frequency Reverse Decoder (FRD), enhance semantics and reconstruct features. Together these enable multi-scale feature learning at low computational cost. Experiments on multiple COD datasets show that FMNet outperforms existing methods in both detection accuracy and efficiency. The source code is publicly available.
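To make the idea of a DCT-based global frequency prior more concrete, here is a minimal NumPy sketch. It is not FMNet's implementation: the function names (`dct_matrix`, `dct2`, `idct2`, `low_frequency_prior`) and the `keep` parameter are hypothetical, and keeping only a low-frequency block of coefficients is just one simple way such a prior could be formed. The point it illustrates is that every DCT coefficient depends on all spatial positions, so frequency features are global by construction.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix; row k is frequency k."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale
    return mat

def dct2(x: np.ndarray) -> np.ndarray:
    """2D DCT over the last two axes of x, e.g. a (C, H, W) feature map."""
    h, w = x.shape[-2:]
    return dct_matrix(h) @ x @ dct_matrix(w).T

def idct2(c: np.ndarray) -> np.ndarray:
    """Inverse of dct2; the basis is orthonormal, so the inverse is the transpose."""
    h, w = c.shape[-2:]
    return dct_matrix(h).T @ c @ dct_matrix(w)

def low_frequency_prior(feature: np.ndarray, keep: int = 4) -> np.ndarray:
    """Hypothetical helper: keep the keep x keep lowest-frequency coefficients.

    Each retained coefficient is a weighted sum over every pixel, so the
    reconstruction carries only coarse, image-wide structure.
    """
    coeffs = dct2(feature)
    mask = np.zeros_like(coeffs)
    mask[..., :keep, :keep] = 1.0
    return idct2(coeffs * mask)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.standard_normal((3, 8, 8))  # toy (C, H, W) feature map
    prior = low_frequency_prior(feat, keep=2)
    print(prior.shape)
```

In a learned model the fixed mask would more plausibly be replaced by learned per-frequency weights; that choice is an assumption here, not something the summary specifies.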

📝 Abstract
Camouflaged Object Detection (COD) is challenging due to the strong similarity between camouflaged objects and their surroundings, which complicates identification. Existing methods mainly rely on spatial local features, failing to capture global information, while Transformers increase computational costs. To address this, the Frequency-Assisted Mamba-Like Linear Attention Network (FMNet) is proposed, which leverages frequency-domain learning to efficiently capture global features and mitigate ambiguity between objects and the background. FMNet introduces the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, integrating frequency and spatial features through a multi-scale structure to handle scale variations while reducing computational complexity. Additionally, the Pyramidal Frequency Attention Extraction (PFAE) module and the Frequency Reverse Decoder (FRD) enhance semantics and reconstruct features. Experimental results demonstrate that FMNet outperforms existing methods on multiple COD datasets, showcasing its advantages in both performance and efficiency. Code available at https://anonymous.4open.science/r/FMNet-3CE5.
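The efficiency argument in the abstract rests on linear attention. The sketch below shows generic kernelized linear attention with an `elu(x) + 1` feature map, next to an equivalent quadratic reference. It is an illustration under assumptions, not the MFM module: the abstract does not give MFM's exact formulation, and the names `feature_map`, `linear_attention`, and `quadratic_reference` are chosen here for illustration. For N tokens of dimension d, the quadratic form builds an N x N weight matrix at roughly O(N²d) cost, while the linear form first summarizes keys and values into a d x d matrix at roughly O(Nd²) cost.

```python
import numpy as np

def feature_map(x: np.ndarray) -> np.ndarray:
    """elu(x) + 1: a positive feature map, so attention weights stay positive."""
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Kernelized attention computed without forming the N x N matrix.

    q, k: (N, d) queries and keys; v: (N, d_v) values.
    """
    q, k = feature_map(q), feature_map(k)
    kv = k.T @ v           # (d, d_v) summary of all keys and values
    z = k.sum(axis=0)      # (d,) normalizer shared by every query
    return (q @ kv) / (q @ z)[:, None]

def quadratic_reference(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Same result via the explicit N x N weight matrix, for comparison only."""
    q, k = feature_map(q), feature_map(k)
    weights = q @ k.T                             # (N, N)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((16, 4))
    k = rng.standard_normal((16, 4))
    v = rng.standard_normal((16, 5))
    out = linear_attention(q, k, v)
    print(out.shape, np.allclose(out, quadratic_reference(q, k, v)))
```

The two functions agree because matrix multiplication is associative: (φ(Q)φ(K)ᵀ)V equals φ(Q)(φ(K)ᵀV). Only the order of evaluation, and therefore the cost, changes.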
Problem

Research questions and friction points this paper is trying to address.

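Camouflaged objects closely resemble their surroundings, which makes them hard to detect.
Methods built on spatial local features fail to capture global information; frequency-domain learning is used to capture it efficiently.
Transformer-based models raise computational cost, so scale variations must be handled while keeping complexity low.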
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-domain learning for global feature capture
Multi-Scale Frequency-Assisted Mamba-Like Linear Attention module
Pyramidal Frequency Attention Extraction and Frequency Reverse Decoder
Ming Deng
Shanghai University
Computer Science
Sijin Sun
Imperial College London
Design engineering, HCI
Zihao Li
Shanghai University
Xiaochuan Hu
University of Electronic Science and Technology of China
Xing Wu
Shanghai University