🤖 AI Summary
Reflections from transparent surfaces (e.g., glass) contaminate the transmitted image and hinder accurate scene recovery.
Method: This paper proposes a reflection–transmission separation method that uses a single flash/no-flash image pair. To handle misalignment between the two images, it uses a dual-branch latent-space diffusion model conditioned on the encoded flash/no-flash latent pair, which suppresses registration errors; a cross-latent decoding step, conditioned on the original images, then reconstructs high-fidelity detail. The framework integrates flash-guided cue conditioning, conditional latent-space generation, and an end-to-end trainable architecture.
Contribution/Results: Evaluated on complex real-world scenes, the method achieves state-of-the-art reflection separation performance and significantly outperforms existing baselines. It improves both quantitative metrics (e.g., PSNR, SSIM) and visual quality, and remains robust to challenging reflections and geometric mismatches.
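For readers unfamiliar with the PSNR metric named above, the following is a minimal sketch of its standard definition, 10·log10(MAX² / MSE). The function name `psnr`, the `max_val` parameter, and the assumption that images are floating-point arrays in [0, 1] are illustrative choices, not details taken from the paper's evaluation protocol.

```python
import numpy as np


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    Assumes pixel values lie in [0, max_val] (an assumption of this sketch).
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mse = np.mean((pred - target) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * np.log10(max_val ** 2 / mse)


# Toy check: a constant error of 0.1 gives MSE = 0.01, i.e. roughly 20 dB.
reference = np.zeros((4, 4, 3))
estimate = np.full((4, 4, 3), 0.1)
score = psnr(estimate, reference)
```

Higher values indicate a closer match to the reference; in reflection separation the reference would typically be a ground-truth transmission image.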
📝 Abstract
Transparent surfaces, such as glass, create complex reflections that obscure images and challenge downstream computer vision applications. We introduce Flash-Split, a robust framework for separating transmitted and reflected light using a single (potentially misaligned) pair of flash/no-flash images. Our core idea is to perform reflection separation in latent space while leveraging the flash cues. Specifically, Flash-Split consists of two stages. Stage 1 separates the reflection latent from the transmission latent via a dual-branch diffusion model conditioned on an encoded flash/no-flash latent pair, effectively mitigating the flash/no-flash misalignment issue. Stage 2 restores high-resolution, faithful details to the separated latents via a cross-latent decoding process conditioned on the original images before separation. By validating Flash-Split on challenging real-world scenes, we demonstrate state-of-the-art reflection separation performance that significantly outperforms the baseline methods.
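The two-stage data flow described above can be sketched as follows. This only illustrates how the pieces connect: `flash_split`, the `toy_*` stand-ins, the 8× latent downsampling, and the decoder signature are all hypothetical and are not the authors' models or API. In particular, the toy separation is dummy arithmetic rather than a diffusion model.

```python
import numpy as np


def flash_split(flash, noflash, encode, separate, decode):
    """Hypothetical wiring of the two stages described in the abstract."""
    # Encode the (possibly misaligned) image pair into a latent pair.
    z_flash, z_noflash = encode(flash), encode(noflash)
    # Stage 1: split into transmission and reflection latents,
    # conditioned on the flash/no-flash latent pair.
    z_t, z_r = separate(z_flash, z_noflash)
    # Stage 2: decode both latents, conditioned on the original images.
    return decode(z_t, z_r, flash, noflash)


# Toy stand-ins so the sketch runs (assumptions, not the paper's components).
def toy_encode(img, s=8):
    h, w, c = img.shape
    # Average-pool by a factor of s in height and width.
    return img.reshape(h // s, s, w // s, s, c).mean(axis=(1, 3))


def toy_separate(z_flash, z_noflash):
    # Dummy split of the no-flash latent; a real model would use z_flash as a cue.
    z_t = 0.5 * z_noflash
    return z_t, z_noflash - z_t


def toy_decode(z_t, z_r, flash, noflash, s=8):
    # Nearest-neighbour upsampling; a real decoder would restore detail
    # using the original images passed in as conditioning.
    def up(z):
        return np.repeat(np.repeat(z, s, axis=0), s, axis=1)
    return up(z_t), up(z_r)


rng = np.random.default_rng(0)
flash_img = rng.random((64, 64, 3))
noflash_img = rng.random((64, 64, 3))
transmission, reflection = flash_split(
    flash_img, noflash_img, toy_encode, toy_separate, toy_decode
)
```

Passing the stages in as callables keeps the sketch independent of any particular encoder, diffusion model, or decoder.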