MSRNet: A Multi-Scale Recursive Network for Camouflaged Object Detection

📅 2025-11-16
🤖 AI Summary
Detecting and segmenting camouflaged objects remains challenging in complex scenes, especially when the objects are small, numerous, poorly lit, heavily occluded, or set against cluttered backgrounds. To address this, we propose MSRNet, a multi-scale recursive network built on a Pyramid Vision Transformer backbone. We design an attention-driven scale integration unit and a multi-granularity feature fusion decoder, and add a recursive feedback mechanism to strengthen global context modeling. Our key contributions are: (1) the first incorporation of a recursive structure into the decoding path for camouflaged object detection; and (2) joint optimization of features across scales, granularities, and iterations. Evaluated on four mainstream camouflaged object detection benchmarks, our method achieves state-of-the-art performance on two datasets and ranks second on the other two. It notably improves detection recall and segmentation accuracy for small and multiple camouflaged objects.

📝 Abstract
Camouflaged object detection is an emerging and challenging computer vision task that requires identifying and segmenting objects that blend seamlessly into their environments due to high similarity in color, texture, and size. The task is further complicated by low-light conditions, partial occlusion, small object size, intricate background patterns, and the presence of multiple objects. Although many sophisticated methods have been proposed, they still struggle to detect camouflaged objects precisely in complex scenarios, especially when the objects are small or numerous. We propose a Multi-Scale Recursive Network that extracts multi-scale features via a Pyramid Vision Transformer backbone and combines them via specialized Attention-Based Scale Integration Units, enabling selective feature merging. For more precise object detection, our decoder recursively refines features by incorporating Multi-Granularity Fusion Units. A novel recursive-feedback decoding strategy is developed to enhance global context understanding, helping the model overcome the challenges of this task. By jointly leveraging multi-scale learning and recursive feature optimization, our method achieves performance gains and successfully detects small and multiple camouflaged objects. Our model achieves state-of-the-art results on two benchmark datasets for camouflaged object detection and ranks second on the remaining two. Our code, model weights, and results are available at https://github.com/linaagh98/MSRNet.
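
The "selective feature merging" described above can be illustrated with a toy sketch. It is not the paper's Attention-Based Scale Integration Unit, whose internals are not given here. It assumes two single-channel feature grids at different resolutions and a hand-written attention score (activation magnitude) where a trained model would use learned weights. The names `upsample_nearest` and `attention_merge` are hypothetical.

```python
import math


def upsample_nearest(feat, factor):
    """Nearest-neighbor upsampling of a 2D grid given as a list of rows."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        # Repeat each widened row `factor` times, copying to avoid aliasing.
        out.extend(list(wide) for _ in range(factor))
    return out


def attention_merge(fine, coarse, factor=2):
    """Merge a fine and a coarse grid with per-pixel softmax weights.

    Assumption: the attention score of each scale is the magnitude of its
    activation. A real model would learn this scoring function.
    """
    coarse_up = upsample_nearest(coarse, factor)
    merged = []
    for f_row, c_row in zip(fine, coarse_up):
        row = []
        for f, c in zip(f_row, c_row):
            # Two-way softmax over the scores, shifted by the max for stability.
            m = max(abs(f), abs(c))
            w_f = math.exp(abs(f) - m)
            w_c = math.exp(abs(c) - m)
            row.append((w_f * f + w_c * c) / (w_f + w_c))
        merged.append(row)
    return merged


fine = [[0.0] * 4 for _ in range(4)]      # 4x4 high-resolution features
coarse = [[1.0, 1.0], [1.0, 1.0]]         # 2x2 low-resolution features
merged = attention_merge(fine, coarse)    # 4x4 grid weighted toward the coarse scale
```

Because the weights form a convex combination, each merged value lies between the two inputs at that pixel, and the scale with the stronger response dominates.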

Problem

Research questions and friction points this paper is trying to address.

Detecting camouflaged objects blending into backgrounds
Handling small and multiple objects in complex scenarios
Improving precision in low-light and occluded conditions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale features extracted via Pyramid Vision Transformer
Attention-based scale integration for selective feature merging
Recursive-feedback decoding strategy enhances global context understanding
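
A minimal sketch of the recursive-feedback idea listed above, under stated assumptions: the decoder is reduced to a hypothetical `decode_once` step that averages the input features with the previous prediction, and `recursive_decode` feeds each prediction back in for a fixed number of passes. The paper's actual decoder and its Multi-Granularity Fusion Units are not specified here, so this shows only the control flow.

```python
def decode_once(features, feedback):
    """One toy decoder pass: blend the features with the previous prediction.

    Assumption: a fixed 50/50 average stands in for a learned decoder.
    """
    return [
        [0.5 * f + 0.5 * p for f, p in zip(f_row, p_row)]
        for f_row, p_row in zip(features, feedback)
    ]


def recursive_decode(features, steps=3):
    """Run the decoder `steps` times, feeding each prediction back as input."""
    prediction = [[0.0] * len(row) for row in features]  # empty initial mask
    history = []
    for _ in range(steps):
        prediction = decode_once(features, prediction)
        history.append(prediction)
    return history


history = recursive_decode([[1.0, 0.0]], steps=3)
# history == [[[0.5, 0.0]], [[0.75, 0.0]], [[0.875, 0.0]]]
```

In this toy setting each pass moves the prediction closer to the feature response, which mirrors the intent of refining the output over several iterations.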