🤖 AI Summary
This work addresses the challenge of segmenting visually entangled objects—such as camouflaged, transparent, or defective targets—in RGB images. Inspired by Retinex theory, the authors decompose an image into illumination and reflectance components to enhance foreground-background discriminability within a unified spatial domain. They introduce the “Discriminability Gap Theorem,” which theoretically guarantees that this decomposition preserves or even improves discriminative capacity across diverse concealed scenarios. Building upon this insight, they design a task-driven Retinex decomposition module, a discriminability gap attention mechanism, and a camouflage-breaking contrastive loss operating in the reflectance feature space. Extensive experiments on multiple concealed object segmentation subtasks demonstrate significant performance gains, validating the effectiveness and generalizability of their homogeneous decomposition strategy.
📝 Abstract
Concealed Object Segmentation (COS) encompasses a family of dense-prediction tasks, including camouflaged object detection, polyp segmentation, transparent object detection, and industrial defect inspection, where targets are visually entangled with their surroundings through different physical mechanisms. Existing methods either operate directly on RGB images or employ heterogeneous decompositions (e.g., Fourier, wavelet) that redistribute spatial evidence across scale/frequency coefficients, making pixel-aligned cues less direct. We introduce a fundamentally different perspective: homogeneous image decomposition via Retinex theory, which factorizes an image into illumination and reflectance components within the same spatial domain. Our key insight is that visual entanglement enforces appearance matching in the composite space, but this does not necessitate simultaneous matching in both component spaces, a phenomenon we formalize as the Discriminability Gap Theorem. Crucially, we show that across diverse COS sub-tasks, the underlying physical processes systematically anti-correlate illumination and reflectance differences, yielding theoretical guarantees that Retinex decomposition preserves or strictly improves total foreground-background discriminability across the full physical regime, with anti-correlation maximizing the gain. Building on this, we propose RIDE comprising: (i) a Task-Driven Retinex Decomposition module that learns segmentation-optimal factorizations end-to-end; (ii) a Discriminability Gap Attention mechanism that adaptively exploits where decomposition helps; and (iii) a Camouflage-Breaking Contrastive loss operating in reflectance feature space.
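The discriminability-gap idea can be illustrated with a toy calculation. The sketch below is not the paper's formal theorem or the RIDE implementation. It assumes the standard multiplicative Retinex model (intensity = reflectance × illumination), summarizes each region by a single scalar mean, and measures foreground-background differences in the log domain, where the composite difference is the sum of the two component differences. The function names `log_gap` and `discriminability` and all numeric values are hypothetical, chosen only for illustration.

```python
import math


def log_gap(fg, bg):
    """Absolute foreground-background difference in the log domain."""
    return abs(math.log(fg) - math.log(bg))


def discriminability(fg_reflectance, fg_illumination, bg_reflectance, bg_illumination):
    """Return (composite gap, summed component gaps) for scalar region means.

    Assumes the multiplicative Retinex model: intensity = reflectance * illumination.
    """
    fg_intensity = fg_reflectance * fg_illumination
    bg_intensity = bg_reflectance * bg_illumination
    composite = log_gap(fg_intensity, bg_intensity)
    components = (log_gap(fg_reflectance, bg_reflectance)
                  + log_gap(fg_illumination, bg_illumination))
    return composite, components


# Anti-correlated case (hypothetical values): the foreground has higher
# reflectance but lower illumination, so the observed intensities match.
entangled = discriminability(0.8, 0.5, 0.4, 1.0)

# Aligned case (hypothetical values): both differences point the same way,
# so the decomposition preserves the gap without enlarging it.
aligned = discriminability(0.8, 1.0, 0.4, 0.5)

print("entangled: composite=%.3f components=%.3f" % entangled)
print("aligned:   composite=%.3f components=%.3f" % aligned)
```

Under these assumptions, the triangle inequality means the summed component gap can never fall below the composite gap. The two gaps are equal when the illumination and reflectance differences share a sign, and the component gap is strictly larger when the signs are opposite, which mirrors the "preserves or strictly improves" wording in the abstract.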