Nested Unfolding Network for Real-World Concealed Object Segmentation

📅 2025-11-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Existing deep unfolding network (DUN)-based methods for concealed object segmentation (COS) have two key limitations: (1) background estimation is entangled with image restoration, so the two tasks interfere with each other, and (2) they rely on predefined degradation types, which limits generalization to real-world scenarios. To address these, we propose the Nested Unfolding Network (NUN), a DUN-in-DUN architecture that decouples restoration from segmentation while still letting the two refine each other. NUN combines vision-language model-guided reasoning about degradation semantics with no-reference image quality assessment, so images are restored adaptively without explicit degradation priors. It also performs reversible foreground-background estimation and adds a self-consistency loss that improves robustness. Extensive experiments show leading performance on both clean and degraded benchmarks.

πŸ“ Abstract
Deep unfolding networks (DUNs) have recently advanced concealed object segmentation (COS) by modeling segmentation as iterative foreground-background separation. However, existing DUN-based methods such as RUN inherently couple background estimation with image restoration. This leads to conflicting objectives and requires predefined degradation types, which is unrealistic in real-world scenarios. To address this, we propose the nested unfolding network (NUN), a unified framework for real-world COS. NUN adopts a DUN-in-DUN design, embedding a degradation-resistant unfolding network (DeRUN) within each stage of a segmentation-oriented unfolding network (SODUN). This design decouples restoration from segmentation while allowing mutual refinement. Guided by a vision-language model (VLM), DeRUN dynamically infers degradation semantics and restores high-quality images without explicit priors, whereas SODUN performs reversible estimation to refine foreground and background. Leveraging the multi-stage nature of unfolding, NUN employs image-quality assessment to select the best DeRUN outputs for subsequent stages, which naturally introduces a self-consistency loss that enhances robustness. Extensive experiments show that NUN achieves leading performance on both clean and degraded benchmarks. Code will be released.
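
The DUN-in-DUN control flow described in the abstract can be pictured as two nested loops: an inner restoration loop runs inside every outer segmentation stage, and a quality score decides which inner output is passed on. The Python sketch below illustrates only that control flow and is not the authors' implementation. The function names (`nested_unfolding`, `smooth`, `neg_total_variation`, `threshold_update`), the stage counts, and the toy 1-D "image" are all assumptions made for illustration. The real DeRUN and SODUN stages are learned networks, and the real quality measure is an image-quality assessment model.

```python
from typing import Callable, List, Tuple

Image = List[float]  # toy stand-in: a 1-D "image" as a list of pixel values in [0, 1]
Mask = List[float]   # per-pixel foreground probability


def nested_unfolding(
    image: Image,
    restore_step: Callable[[Image, Mask], Image],
    segment_step: Callable[[Image, Mask], Mask],
    quality_score: Callable[[Image], float],
    outer_stages: int = 3,
    inner_stages: int = 2,
) -> Tuple[Image, Mask]:
    """Run an inner restoration loop inside every outer segmentation stage."""
    mask: Mask = [0.5] * len(image)  # uninformative initial foreground estimate
    restored = image
    for _ in range(outer_stages):
        # Inner loop (restoration role): keep every intermediate output.
        candidates = []
        current = restored
        for _ in range(inner_stages):
            current = restore_step(current, mask)
            candidates.append(current)
        # Quality assessment picks the candidate handed to the next stage.
        restored = max(candidates, key=quality_score)
        # Outer step (segmentation role): refine the mask on the chosen image.
        mask = segment_step(restored, mask)
    return restored, mask


# --- Hypothetical toy components, for illustration only ---
def smooth(img: Image, mask: Mask) -> Image:
    """3-tap mean filter with replicated borders (this toy ignores the mask)."""
    n = len(img)
    return [(img[max(i - 1, 0)] + img[i] + img[min(i + 1, n - 1)]) / 3 for i in range(n)]


def neg_total_variation(img: Image) -> float:
    """Higher is better: smoother signals score closer to zero."""
    return -sum(abs(b - a) for a, b in zip(img, img[1:]))


def threshold_update(img: Image, mask: Mask) -> Mask:
    """Blend the previous mask with a mean-threshold of the image."""
    mean = sum(img) / len(img)
    return [0.5 * m + 0.5 * (1.0 if p > mean else 0.0) for p, m in zip(img, mask)]


noisy = [0.1, 0.9, 0.2, 0.8, 0.9, 0.95, 0.85, 0.1]
restored, mask = nested_unfolding(noisy, smooth, threshold_update, neg_total_variation)
```

Passing the restoration step, segmentation step, and quality score in as callables keeps the two roles separate, which mirrors the decoupling the abstract describes.
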
Problem

Research questions and friction points this paper is trying to address.

Decouples image restoration from concealed object segmentation
Handles unknown real-world degradations without predefined priors
Resolves conflicting objectives in iterative foreground-background separation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nested unfolding network decouples restoration from segmentation
Vision-language model dynamically infers degradation semantics
Multi-stage quality assessment selects optimal restoration outputs
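
The summary does not give the form of the self-consistency loss, so the following Python sketch is only one plausible reading and not the authors' definition. It assumes that the stage whose restored image scores highest under quality assessment serves as a reference, and that the other stages' masks are penalized by their mean squared gap to it. The function name `self_consistency_loss` and both of its inputs are hypothetical.

```python
from typing import Sequence


def self_consistency_loss(
    stage_masks: Sequence[Sequence[float]],
    stage_quality: Sequence[float],
) -> float:
    """Mean squared gap between each stage's mask and the best-quality stage's mask."""
    if len(stage_masks) != len(stage_quality) or not stage_masks:
        raise ValueError("need one quality score per stage mask")
    # Assumption: the stage with the highest quality score acts as the reference.
    best = max(range(len(stage_quality)), key=lambda i: stage_quality[i])
    reference = stage_masks[best]
    others = [m for i, m in enumerate(stage_masks) if i != best]
    if not others:
        return 0.0  # a single stage is trivially consistent with itself
    total = 0.0
    for m in others:
        total += sum((p - r) ** 2 for p, r in zip(m, reference)) / len(reference)
    return total / len(others)


masks = [[0.2, 0.8], [0.4, 0.6], [0.5, 0.5]]
quality = [0.1, 0.9, 0.3]
loss = self_consistency_loss(masks, quality)  # reference is the second stage
```
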