🤖 AI Summary
Problem: Shadow modeling in deep learning suffers from fragmented evaluation protocols, poor generalization, and non-reproducible experiments. Method: We present a survey and standardized benchmark for shadow detection, removal, and generation in images and videos, covering task definitions, deep models, datasets, evaluation metrics, and cross-dataset generalization analysis. We also empirically characterize the trade-off between model size, speed, and performance. Contribution/Results: We compare methods under consistent, reproducible experimental settings, exposing generalization bottlenecks and identifying open challenges. All components, including the benchmark suite, codebase, and pre-trained models, are publicly released to advance standardization and practical deployment of shadow modeling.
📝 Abstract
Shadows are created when light encounters obstacles, resulting in regions of reduced illumination. In computer vision, detecting, removing, and generating shadows are critical tasks for improving scene understanding, enhancing image quality, ensuring visual consistency in video editing, and optimizing virtual environments. This paper offers a comprehensive survey and evaluation benchmark on shadow detection, removal, and generation in both images and videos, focusing on the deep learning approaches of the past decade. It covers key aspects such as tasks, deep models, datasets, evaluation metrics, and comparative results under consistent experimental settings. Our main contributions include a thorough survey of shadow analysis, the standardization of experimental comparisons, an exploration of the relationships between model size, speed, and performance, a cross-dataset generalization study, the identification of open challenges and future research directions, and the provision of publicly available resources to support further research in this field.