AI Summary
To address the challenge of diverse image degradations under complex illumination conditions (e.g., low-light, backlighting) and the poor generalizability of existing methods, this paper proposes a dual-illumination adaptive enhancement framework. Grounded in Retinex theory, the framework integrates illumination-aware cross-attention and sequential global attention modules to jointly model illumination conditions and suppress artifacts. A novel sparse gating mechanism dynamically routes inputs to multiple S-curve expert networks, enabling adaptive specialization for distinct degradation patterns. Furthermore, a Mixture-of-Experts (MoE) strategy is employed for end-to-end illumination estimation. Extensive experiments on both synthetic and real-world low-light/backlight datasets demonstrate state-of-the-art performance. Crucially, the framework achieves zero-shot cross-scenario generalization without retraining, significantly improving not only perceptual image quality but also downstream vision tasks, including object detection and semantic segmentation.
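The routing idea in the summary can be illustrated with a small NumPy sketch. It is not the paper's implementation. The logistic form of `s_curve`, the two-number illumination descriptor (mean and standard deviation), the linear gate `W`, `b`, and the expert parameters are all hypothetical stand-ins for components the paper learns end to end.

```python
import numpy as np

def s_curve(x, gain, mid):
    """Logistic tone curve rescaled so 0 -> 0 and 1 -> 1 (hypothetical expert form)."""
    sig = lambda t: 1.0 / (1.0 + np.exp(-gain * (t - mid)))
    lo, hi = sig(0.0), sig(1.0)
    return (sig(x) - lo) / (hi - lo)

def sparse_gate(logits, k):
    """Keep the k largest logits, softmax over them, and zero out the rest."""
    top = np.argsort(logits)[-k:]
    w = np.exp(logits[top] - logits[top].max())  # shift for numerical stability
    gates = np.zeros_like(logits, dtype=float)
    gates[top] = w / w.sum()
    return gates

def route_and_enhance(img, W, b, experts, k=2):
    """Gate on a crude illumination descriptor, then blend the selected experts."""
    feat = np.array([img.mean(), img.std()])  # assumed descriptor, not the paper's
    gates = sparse_gate(W @ feat + b, k)
    out = sum(g * s_curve(img, *experts[i]) for i, g in enumerate(gates) if g > 0)
    return out, gates

# Hypothetical setup: four experts, each defined by (gain, midpoint).
experts = [(4.0, 0.2), (8.0, 0.5), (6.0, 0.8), (3.0, 0.4)]
rng = np.random.default_rng(0)
img = rng.random((8, 8)) * 0.3           # synthetic dark luminance map in [0, 0.3)
W, b = rng.normal(size=(4, 2)), np.zeros(4)
out, gates = route_and_enhance(img, W, b, experts, k=2)
```

Only the `k` selected experts are evaluated, which is the point of sparse gating: experts the gate does not select cost nothing and contribute nothing to the output.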
Abstract
Image degradation caused by complex lighting conditions such as low-light and backlit scenarios is commonly encountered in real-world environments, significantly affecting image quality and downstream vision tasks. Most existing methods focus on a single type of illumination degradation and lack the ability to handle diverse lighting conditions in a unified manner. To address this issue, we propose a dual-illumination enhancement framework called DIME-Net. The core of our method is a Mixture-of-Experts illumination estimator module, where a sparse gating mechanism adaptively selects suitable S-curve expert networks based on the illumination characteristics of the input image. By integrating Retinex theory, this module effectively performs enhancement tailored to both low-light and backlit images. To further correct illumination-induced artifacts and color distortions, we design a damage restoration module equipped with Illumination-Aware Cross Attention and Sequential-State Global Attention mechanisms. In addition, we construct a hybrid illumination dataset, MixBL, by integrating existing datasets, allowing our model to achieve robust illumination adaptability through a single training process. Experimental results show that DIME-Net achieves competitive performance on both synthetic and real-world low-light and backlit datasets without any retraining. These results demonstrate its generalization ability and potential for practical multimedia applications under diverse and complex illumination conditions.
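As a rough illustration of the Retinex step the abstract refers to, the sketch below models an image as reflectance times illumination and remaps only the illumination. The channel-wise maximum as the illumination prior, the `eps` constant, and the caller-supplied `curve` are assumptions made for this example. In DIME-Net the illumination estimate comes from the learned Mixture-of-Experts module, not from a fixed function.

```python
import numpy as np

def retinex_enhance(img, curve, eps=1e-4):
    """Split img (H, W, 3, values in [0, 1]) into reflectance and illumination,
    remap the illumination with `curve`, and recombine."""
    illum = img.max(axis=-1, keepdims=True)   # assumed prior: per-pixel max over RGB
    reflect = img / (illum + eps)             # reflectance keeps the colour ratios
    return np.clip(reflect * curve(illum), 0.0, 1.0)

# Usage: a uniformly dark patch, brightened with a square-root curve
# (a placeholder for whatever curve an expert network would predict).
dark = np.ones((2, 2, 3)) * np.array([0.02, 0.04, 0.01])
bright = retinex_enhance(dark, np.sqrt)
```

Because only the illumination is remapped, the ratios between colour channels are unchanged unless clipping occurs. This is one reason Retinex-style pipelines pair the brightening step with a separate restoration stage for noise and colour artifacts.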