Nested Unfolding Network for Real-World Concealed Object Segmentation

📅 2025-11-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing deep unfolding network (DUN)-based methods for concealed object segmentation (COS) suffer from two key limitations: (1) background estimation and image restoration are entangled, which causes task interference, and (2) they rely on predefined degradation models, which limits generalization to real-world scenarios. To address these, the authors propose the Nested Unfolding Network (NUN), a DUN-in-DUN architecture that decouples restoration from segmentation while optimizing them jointly. NUN combines semantic degradation reasoning guided by a vision-language model with image-quality assessment, so restoration adapts to the input without explicit degradation priors. It also performs reversible foreground-background estimation and adds a self-consistency loss to improve robustness. Experiments show leading performance on both clean and degraded benchmarks.

📝 Abstract

Deep unfolding networks (DUNs) have recently advanced concealed object segmentation (COS) by modeling segmentation as iterative foreground-background separation. However, existing DUN-based methods such as RUN inherently couple background estimation with image restoration, which leads to conflicting objectives and requires predefined degradation types that are unrealistic in real-world scenarios. To address this, we propose the nested unfolding network (NUN), a unified framework for real-world COS. NUN adopts a DUN-in-DUN design, embedding a degradation-resistant unfolding network (DeRUN) within each stage of a segmentation-oriented unfolding network (SODUN). This design decouples restoration from segmentation while allowing mutual refinement. Guided by a vision-language model (VLM), DeRUN dynamically infers degradation semantics and restores high-quality images without explicit priors, whereas SODUN performs reversible estimation to refine foreground and background. Leveraging the multi-stage nature of unfolding, NUN employs image-quality assessment to select the best DeRUN outputs for subsequent stages, naturally introducing a self-consistency loss that enhances robustness. Extensive experiments show that NUN achieves leading performance on both clean and degraded benchmarks. Code will be released.
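
The control flow described in the abstract (an inner restoration loop nested inside each outer segmentation stage, with a quality score choosing which restored image moves on) can be illustrated with a toy sketch. This is not the authors' implementation: the functions `toy_restore`, `toy_quality`, `toy_segment`, and `nested_unfolding` are hypothetical names, images are flat lists of floats, the "restoration" is a 3-tap average, the "quality assessment" is negative total variation, and no vision-language model or learned component is involved.

```python
from typing import List, Tuple

Image = List[float]  # toy stand-in for an image: flat list of values in [0, 1]
Mask = List[float]   # per-pixel foreground probability


def toy_restore(image: Image) -> Image:
    """Hypothetical inner (DeRUN-like) stage: 3-tap average with edge replication."""
    n = len(image)
    out = []
    for i in range(n):
        left = image[max(i - 1, 0)]
        right = image[min(i + 1, n - 1)]
        out.append((left + image[i] + right) / 3.0)
    return out


def toy_quality(image: Image) -> float:
    """Hypothetical no-reference quality score: negative total variation (higher is better)."""
    return -sum(abs(a - b) for a, b in zip(image, image[1:]))


def toy_segment(image: Image, mask: Mask) -> Mask:
    """Hypothetical outer (SODUN-like) stage: move the mask halfway toward a mean threshold."""
    mean = sum(image) / len(image)
    target = [1.0 if p > mean else 0.0 for p in image]
    return [0.5 * m + 0.5 * t for m, t in zip(mask, target)]


def nested_unfolding(
    image: Image, outer_stages: int = 3, inner_stages: int = 2
) -> Tuple[Image, Mask]:
    """Run the nested loop: restore, pick the best candidate, then refine the mask."""
    mask = [0.5] * len(image)
    current = image
    for _ in range(outer_stages):
        # Inner loop: collect every intermediate restoration, including the input.
        candidates = [current]
        for _ in range(inner_stages):
            candidates.append(toy_restore(candidates[-1]))
        # Quality-based selection: the best candidate feeds the next outer stage.
        current = max(candidates, key=toy_quality)
        # Outer step: refine the segmentation on the selected image.
        mask = toy_segment(current, mask)
    return current, mask
```

Because the unrestored input is kept among the candidates, the selected image never scores worse than the one entering the stage. The toy score simply rewards smoothness, so it favors the most heavily averaged candidate; a real quality measure would not behave this way.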

Problem

Research questions and friction points this paper is trying to address.

Decouples image restoration from concealed object segmentation

Handles unknown real-world degradations without predefined priors

Resolves conflicting objectives in iterative foreground-background separation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nested unfolding network decouples restoration from segmentation

Vision-language model dynamically infers degradation semantics

Multi-stage quality assessment selects optimal restoration outputs
