🤖 AI Summary
Existing concealed visual perception (CVP) methods are largely confined to reversible modeling in the mask domain, overlooking the potential of joint optimization in the RGB domain. This work formulates CVP as a dual-domain joint optimization problem and proposes RUN++, the first framework to unify reversible modeling across both the mask and RGB domains. It introduces a Bernoulli diffusion refinement mechanism that explicitly rectifies uncertain regions, and it embeds a reversible unfolding network within a broader bi-level optimization paradigm through three purpose-built modules: CORE (mask-domain reversible modeling), CARE (context-aware RGB-domain enhancement), and FINE (noise-driven iterative refinement). Experiments show that RUN++ markedly reduces segmentation false positives and false negatives, remains robust under realistic degradations, recovers finer detail than full-image generative approaches, and incurs lower computational overhead.
📝 Abstract
Existing methods for concealed visual perception (CVP) often leverage reversible strategies to decrease uncertainty, yet these are typically confined to the mask domain, leaving the potential of the RGB domain underexplored. To address this, we propose a reversible unfolding network with generative refinement, termed RUN++. Specifically, RUN++ first formulates the CVP task as a mathematical optimization problem and unfolds the iterative solution into a multi-stage deep network. This approach provides a principled way to apply reversible modeling across both mask and RGB domains while leveraging a diffusion model to resolve the resulting uncertainty. Each stage of the network integrates three purpose-driven modules: a Concealed Object Region Extraction (CORE) module applies reversible modeling to the mask domain to identify core object regions; a Context-Aware Region Enhancement (CARE) module extends this principle to the RGB domain to foster better foreground-background separation; and a Finetuning Iteration via Noise-based Enhancement (FINE) module provides a final refinement. The FINE module introduces a targeted Bernoulli diffusion model that refines only the uncertain regions of the segmentation mask, harnessing the generative power of diffusion for fine-detail restoration without the prohibitive computational cost of a full-image process. This unique synergy, where the unfolding network provides a strong uncertainty prior for the diffusion model, allows RUN++ to efficiently direct its focus toward ambiguous areas, significantly mitigating false positives and negatives. Furthermore, we introduce a new paradigm for building robust CVP systems that remain effective under real-world degradations and extend this concept into a broader bi-level optimization framework.
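As a rough illustration of the FINE idea, confining expensive generative refinement to ambiguous pixels rather than the full image, the sketch below uses NumPy with a toy probability-sharpening refiner in place of the paper's Bernoulli diffusion model; the threshold `tau` and the `sharpen` function are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def uncertainty_mask(prob, tau=0.3):
    # Pixels whose foreground probability sits near 0.5 are "uncertain";
    # |p - 0.5| < tau flags them for refinement (tau is a hypothetical cutoff).
    return np.abs(prob - 0.5) < tau

def refine_uncertain(prob, refiner, tau=0.3):
    # Apply the (expensive) refiner only where the coarse mask is ambiguous,
    # leaving confident predictions untouched -- the source of the efficiency
    # gain over full-image generative refinement.
    mask = uncertainty_mask(prob, tau)
    out = prob.copy()
    out[mask] = refiner(prob[mask])
    return out

# Toy stand-in refiner: snap ambiguous probabilities to a hard decision.
# (RUN++ instead runs a targeted Bernoulli diffusion process here.)
sharpen = lambda p: (p > 0.5).astype(float)

coarse = np.array([[0.95, 0.55],
                   [0.45, 0.05]])
refined = refine_uncertain(coarse, sharpen)
# Confident pixels (0.95, 0.05) pass through; ambiguous ones (0.55, 0.45)
# are resolved to 1.0 and 0.0 respectively.
```

The unfolding network's stage-wise mask estimate plays the role of `coarse` here, supplying the uncertainty prior that tells the diffusion model where to spend its compute.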