🤖 AI Summary
This work addresses single image reflection separation, where nonlinear mixing makes the transmission and reflection layers hard to tell apart, and where existing deep decoders rely on implicit feature fusion with inadequate multi-scale coordination. The authors propose ReflexSplit, a dual-stream framework that combines Cross-scale Gated Fusion (CrGF) with Layer Fusion-Separation Blocks (LFSB). The LFSB alternate between fusion and differential separation, using a cross-stream subtraction attention mechanism inspired by Differential Transformer. A curriculum training strategy progressively strengthens the differential separation through depth-dependent initialization and epoch-wise warmup. The authors report state-of-the-art performance on both synthetic and real-world benchmarks, with superior perceptual quality and robust generalization.
📝 Abstract
Single Image Reflection Separation (SIRS) disentangles mixed images into transmission and reflection layers. Existing methods suffer from transmission-reflection confusion under nonlinear mixing, particularly in deep decoder layers, due to implicit fusion mechanisms and inadequate multi-scale coordination. We propose ReflexSplit, a dual-stream framework with three key innovations. (1) Cross-scale Gated Fusion (CrGF) adaptively aggregates semantic priors, texture details, and decoder context across hierarchical depths, stabilizing gradient flow and maintaining feature consistency. (2) Layer Fusion-Separation Blocks (LFSB) alternate between fusion for shared structure extraction and differential separation for layer-specific disentanglement. Inspired by Differential Transformer, we extend attention cancellation to dual-stream separation via cross-stream subtraction. (3) Curriculum training progressively strengthens differential separation through depth-dependent initialization and epoch-wise warmup. Extensive experiments on synthetic and real-world benchmarks demonstrate state-of-the-art performance with superior perceptual quality and robust generalization. Our code is available at https://github.com/wuw2135/ReflexSplit.
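The cross-stream subtraction and curriculum ideas in (2) and (3) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It makes several assumptions: each stream computes plain self-attention with identity projections (no learned query, key, or value weights); the depth-dependent initialization reuses the schedule from the Differential Transformer paper, which ReflexSplit may or may not adopt as-is; and the linear epoch ramp is a stand-in for the unspecified warmup. The names `lambda_init`, `warmup_scale`, and `cross_stream_subtraction` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def lambda_init(depth):
    """Depth-dependent initial subtraction weight (depth is 1-indexed).

    Schedule taken from the Differential Transformer paper; using it here
    is an assumption, not a confirmed detail of ReflexSplit.
    """
    return 0.8 - 0.6 * np.exp(-0.3 * (depth - 1))


def warmup_scale(epoch, warmup_epochs):
    """Hypothetical linear ramp from 0 to 1 over the warmup epochs."""
    if warmup_epochs <= 0:
        return 1.0
    return min(1.0, epoch / warmup_epochs)


def cross_stream_subtraction(feat_t, feat_r, lam):
    """Subtract each stream's attention map from the other stream's map.

    feat_t, feat_r: (tokens, dim) features of the transmission and
    reflection streams. Identity projections are a simplification.
    """
    dim = feat_t.shape[-1]
    attn_t = softmax(feat_t @ feat_t.T / np.sqrt(dim))
    attn_r = softmax(feat_r @ feat_r.T / np.sqrt(dim))
    # Attention that both streams share is attenuated in each output.
    out_t = (attn_t - lam * attn_r) @ feat_t
    out_r = (attn_r - lam * attn_t) @ feat_r
    return out_t, out_r


rng = np.random.default_rng(0)
feat_t = rng.normal(size=(16, 8))
feat_r = rng.normal(size=(16, 8))

# Effective weight for decoder depth 3 at epoch 5 of a 10-epoch warmup.
lam = warmup_scale(epoch=5, warmup_epochs=10) * lambda_init(depth=3)
out_t, out_r = cross_stream_subtraction(feat_t, feat_r, lam)
```

With `lam = 0` each stream reduces to ordinary self-attention, so ramping the weight up over epochs lets the streams start as independent branches before the subtraction takes effect.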