AI Summary
This work addresses the domain shift in RAW images arising from differences in exposure, spectral response, and bit depth across heterogeneous camera sensors. To tackle this challenge, the authors propose a physics-guided global-local tone mapping framework that decouples RAW data variations into global tone correction and spatially adaptive local color adjustments. The model is trained using a physics-based RAW simulation pipeline, enabling, for the first time, a single unified detector to perform robust object detection across diverse RAW sensor domains. By effectively bridging the substantial domain gap inherent in RAW imagery, the method achieves state-of-the-art performance on multiple RAW benchmarks spanning 10- to 24-bit depths, consistently outperforming existing approaches in single-dataset, mixed-dataset, and robustness evaluations.
Abstract
Camera sensor RAW data offers intrinsic advantages for object detection, including deeper bit depth, preserved physical information, and freedom from image signal processor (ISP) distortions. However, varying exposure conditions, spectral sensitivities, and bit depths across devices introduce substantially larger domain gaps than sRGB, making sensor-agnostic generalization a fundamental challenge. In this study, we present RAWild, a physics-guided global-local tone mapping framework for sensor-agnostic RAW object detection. By factoring sensor-induced variations into a global tonal correction and a spatially adaptive local color adjustment, both driven by RAW distribution priors, our framework enables a single network to train jointly across heterogeneous sensors. To further support cross-sensor generalization, we construct a physics-based RAW simulation pipeline that synthesizes realistic sensor outputs spanning diverse spectral sensitivities, illuminants, and sensor non-idealities. Extensive experiments across multiple RAW benchmarks covering bit depths from 10 to 24 demonstrate state-of-the-art (SOTA) performance under single-dataset, mixed-dataset, and challenging robustness settings.
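To make the global-versus-local split concrete, the following is a minimal toy sketch and not the RAWild architecture, which the abstract does not specify. It assumes three hand-written stages: bit-depth normalization, a global gamma chosen from the log-mean intensity (a simple stand-in for a "RAW distribution prior"), and a per-window gain standing in for the spatially adaptive local adjustment. All function names, the 0.18 gray target, and the 1D pixel layout are illustrative assumptions.

```python
import math


def normalize_raw(pixels, bit_depth, black_level=0):
    """Scale integer RAW values to [0, 1] using the sensor bit depth (assumed known)."""
    white = (1 << bit_depth) - 1
    scale = white - black_level
    return [min(max((p - black_level) / scale, 0.0), 1.0) for p in pixels]


def global_gamma(values, target=0.18, eps=1e-6):
    """Choose gamma so the log-mean (geometric mean) intensity maps to `target`."""
    log_mean = math.exp(sum(math.log(v + eps) for v in values) / len(values))
    # Keep the log-mean strictly inside (0, 1) so the ratio below is well defined.
    log_mean = min(max(log_mean, eps), 1.0 - eps)
    return math.log(target) / math.log(log_mean)


def global_tone_map(values, gamma):
    """Apply one tone curve to every pixel (the 'global' stage)."""
    return [v ** gamma for v in values]


def local_gain(values, window=2, target=0.18, strength=0.5, eps=1e-6):
    """Toy 'local' stage: nudge each window's mean toward `target` with its own gain."""
    out = []
    for start in range(0, len(values), window):
        block = values[start:start + window]
        mean = sum(block) / len(block)
        gain = (target / (mean + eps)) ** strength
        out.extend(min(v * gain, 1.0) for v in block)
    return out


if __name__ == "__main__":
    scene = [0.01, 0.05, 0.2, 0.8]  # hypothetical linear scene radiances
    for bits in (10, 24):
        raw = [round(r * ((1 << bits) - 1)) for r in scene]
        norm = normalize_raw(raw, bits)
        toned = global_tone_map(norm, global_gamma(norm))
        print(bits, [round(v, 3) for v in local_gain(toned)])
```

In this sketch the same scene quantized at 10 and 24 bits lands on nearly the same values after normalization, which is the kind of sensor-induced variation a global stage can absorb before any spatially varying correction is applied. A learned framework would presumably predict such parameters from the data rather than fix them by hand.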