Assisted Refinement Network Based on Channel Information Interaction for Camouflaged and Salient Object Detection

📅 2025-12-12
🤖 AI Summary
To address two key bottlenecks in camouflaged object detection (COD), namely insufficient channel-wise interaction within same-layer features and difficulty in jointly modeling boundary and region information, this paper proposes a Channel Information Interaction Module (CIIM) and a prior-guided collaborative decoding architecture. CIIM reorganizes features across channels through a horizontal-vertical integration mechanism in the channel dimension. The decoder generates boundary priors and object localization maps through Boundary Extraction (BE) and Region Extraction (RE) modules, then applies hybrid attention to calibrate the decoded features, while a Multi-scale Enhancement (MSE) module enriches contextual representations. The method reports state-of-the-art performance on four COD benchmarks. The model is also transferred to salient object detection (SOD) and to downstream tasks including polyp segmentation, transparent object detection, and industrial and road defect detection. Code and experimental results are publicly available.

📝 Abstract
Camouflaged Object Detection (COD) is a significant challenge in computer vision: identifying and segmenting objects that blend visually into their backgrounds. Current mainstream methods have made progress in cross-layer feature fusion, but two critical issues persist during the decoding stage. The first is insufficient cross-channel information interaction within same-layer features, which limits feature expressiveness. The second is the inability to effectively co-model boundary and region information, which makes it difficult to accurately reconstruct complete regions and sharp boundaries of objects. To address the first issue, we propose the Channel Information Interaction Module (CIIM), which introduces a horizontal-vertical integration mechanism in the channel dimension. This module performs feature reorganization and interaction across channels to effectively capture complementary cross-channel information. To address the second issue, we construct a collaborative decoding architecture guided by prior knowledge. This architecture generates boundary priors and object localization maps through Boundary Extraction (BE) and Region Extraction (RE) modules, then employs hybrid attention to collaboratively calibrate decoded features, effectively overcoming semantic ambiguity and imprecise boundaries. Additionally, the Multi-scale Enhancement (MSE) module enriches contextual feature representations. Extensive experiments on four COD benchmark datasets validate the effectiveness and state-of-the-art performance of the proposed model. We further transfer our model to the Salient Object Detection (SOD) task and demonstrate its adaptability across downstream tasks, including polyp segmentation, transparent object detection, and industrial and road defect detection. Code and experimental results are publicly available at: https://github.com/akuan1234/ARNet-v2.
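The abstract does not spell out how the boundary and region priors calibrate the decoded features, so the following is only a minimal NumPy sketch of one plausible form of prior-guided attention. The function name `calibrate_with_priors`, the use of an element-wise maximum for the spatial map, and region-weighted pooling for the channel weights are all assumptions made for illustration, not the authors' design.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


def calibrate_with_priors(features, boundary_logits, region_logits):
    """Hypothetical prior-guided calibration (not the paper's exact module).

    features:        array of shape (C, H, W), decoded features
    boundary_logits: array of shape (H, W), raw boundary prior
    region_logits:   array of shape (H, W), raw region (localization) prior
    """
    boundary = sigmoid(boundary_logits)
    region = sigmoid(region_logits)

    # Spatial attention (assumed form): keep locations flagged by either prior.
    spatial = np.maximum(boundary, region)  # (H, W)

    # Channel attention (assumed form): pool each channel inside the region prior.
    pooled = (features * region).sum(axis=(1, 2)) / (region.sum() + 1e-6)  # (C,)
    channel = sigmoid(pooled)  # (C,)

    # Residual calibration: the input passes through unchanged where priors are ~0.
    return features + features * spatial[None, :, :] * channel[:, None, None]
```

The residual form is a deliberate choice in this sketch: when both priors are near zero the features are returned almost unchanged, so an unreliable prior cannot erase information.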
Problem

Research questions and friction points this paper is trying to address.

Addresses insufficient cross-channel information interaction in camouflaged object detection
Solves inability to co-model boundary and region information for accurate segmentation
Proposes a collaborative decoding architecture with prior knowledge guidance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Channel Information Interaction Module for cross-channel feature reorganization
Collaborative decoding with boundary and region extraction modules
Multi-scale Enhancement module enriches contextual feature representations
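As a rough illustration of cross-channel feature reorganization, the sketch below arranges the channels into a 2-D grid and lets each channel exchange information along its row ("horizontal") and its column ("vertical"). This is one possible reading of horizontal-vertical integration in the channel dimension; the grid layout, the `rows` parameter, and the use of plain averaging in place of learned layers are assumptions and are not taken from the paper.

```python
import numpy as np


def channel_grid_interaction(features, rows):
    """Hypothetical horizontal-vertical channel mixing (illustrative only).

    features: array of shape (C, H, W)
    rows:     number of rows in the channel grid; C must be divisible by rows
    """
    c, h, w = features.shape
    if c % rows != 0:
        raise ValueError("channel count must be divisible by rows")
    cols = c // rows

    # Reorganize the channel axis into a (rows, cols) grid.
    grid = features.reshape(rows, cols, h, w)

    # "Horizontal" pass: summary of the channels sharing a row.
    horizontal = grid.mean(axis=1, keepdims=True)  # (rows, 1, H, W)

    # "Vertical" pass: summary of the channels sharing a column.
    vertical = grid.mean(axis=0, keepdims=True)  # (1, cols, H, W)

    # Each channel keeps its own response and adds both summaries (broadcast).
    mixed = grid + horizontal + vertical
    return mixed.reshape(c, h, w)
```

With four single-pixel channels holding 0, 1, 2, 3 and `rows=2`, the grid is `[[0, 1], [2, 3]]`, the row means are 0.5 and 2.5, and the column means are 1 and 2, so the mixed channels are 1.5, 3.5, 5.5, 7.5. A learned module would presumably replace the means with trainable operations.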
Kuan Wang
Tsinghua University; Georgia Institute of Technology
Natural Language Processing, Machine Learning, Computer Vision
Yanjun Qin
Tsinghua University
Traffic Forecasting, Transportation mode recognition
Mengge Lu
School of Computer Science and Technology, Xinjiang University, Urumqi, 830017, China
Liejun Wang
School of Computer Science and Technology, Xinjiang University, Urumqi, 830017, China
Xiaoming Tao
Tsinghua University
Wireless multimedia communications