🤖 AI Summary
Existing white balance (WB) correction methods suffer from low accuracy in multi-illuminant scenes, and conventional linear fusion approaches fail to ensure color consistency across regions. Method: This paper proposes the first Transformer-based image fusion framework tailored to multi-illuminant WB correction. It performs end-to-end nonlinear fusion of multiple preset-WB renderings of the same scene directly in the sRGB domain, replacing traditional linear weighting, and introduces a lightweight spatial-illumination-aware Transformer module that jointly models local chromatic bias and global illumination interactions. Contribution/Results: To enable training and evaluation, we introduce ML-WB, the first large-scale multi-illuminant WB dataset, comprising over 16,000 images rendered with five preset WB settings under complex mixed-illumination conditions. Experiments on ML-WB show that our method achieves up to a 100% improvement over state-of-the-art methods, significantly enhancing both color consistency and cross-scene generalization.
📝 Abstract
White balance (WB) correction in scenes with multiple illuminants remains a persistent challenge in computer vision. Recent methods have explored fusion-based approaches, in which a neural network linearly blends multiple sRGB versions of an input image, each processed with a predefined WB preset. However, we demonstrate that these methods are suboptimal for common multi-illuminant scenarios. Additionally, existing fusion-based methods rely on sRGB WB datasets that lack dedicated multi-illuminant images, limiting both training and evaluation. To address these challenges, we introduce two key contributions. First, we propose an efficient transformer-based model that effectively captures spatial dependencies across sRGB WB presets, substantially improving upon linear fusion techniques. Second, we introduce a large-scale multi-illuminant dataset comprising over 16,000 sRGB images rendered with five different WB settings, along with their WB-corrected counterparts. Our method achieves up to a 100% improvement over existing techniques on our new multi-illuminant image fusion dataset.
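To make the linear-fusion baseline that the paper improves upon concrete, here is a minimal sketch of per-pixel weighted blending of preset-WB renderings. The function name, array shapes, and weight normalization are illustrative assumptions, not taken from the paper; in the actual fusion-based methods the per-pixel weights would be predicted by a neural network rather than supplied by hand.

```python
import numpy as np

def linear_wb_fusion(presets, weights):
    """Linearly blend preset-WB renderings with per-pixel weights.

    presets: (N, H, W, 3) array -- N sRGB renderings of the same scene,
             each processed with a different predefined WB preset.
    weights: (N, H, W) array -- per-pixel blending weights (in practice
             predicted by a network); normalized here to sum to 1 over
             the preset axis.
    Returns a single (H, W, 3) fused sRGB image.
    """
    w = weights / np.clip(weights.sum(axis=0, keepdims=True), 1e-8, None)
    return (w[..., None] * presets).sum(axis=0)

# Toy example: blend two 2x2 renderings (all-black and all-white)
# with uniform weights; every output pixel becomes 0.5.
presets = np.stack([np.zeros((2, 2, 3)), np.ones((2, 2, 3))])
weights = np.full((2, 2, 2), 0.5)
fused = linear_wb_fusion(presets, weights)
```

Because the blend is a per-pixel convex combination, each output pixel is constrained to the span of the preset colors at that location; the paper's transformer-based nonlinear fusion is motivated by exactly this limitation in mixed-illumination regions.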