A Data Efficiency Study of Synthetic Fog for Object Detection Using the Clear2Fog Pipeline

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the degradation of object detection performance in autonomous driving under fog, which stems largely from the scarcity of labeled real-world foggy data. The authors propose Clear2Fog (C2F), an end-to-end, physics-based simulation pipeline that adds fog to clear-weather data while keeping camera and LiDAR consistent at the sensor level. The method combines monocular depth estimation with a novel atmospheric light (airlight) estimation method to mitigate the structural artifacts and chromatic biases common in existing techniques. In a human perceptual study, C2F images were preferred 92.95% of the time over an established method. A data efficiency study on 270,000 training images from the Waymo Open Dataset shows that models trained on mixed-density fog at 75% scale outperform models trained on fixed-density fog at 100% scale. For sim-to-real transfer, the pre-trained models are fine-tuned on real-world foggy data; raising the fine-tuning learning rate to ten times its default overcomes negative transfer from synthetic biases and yields a 1.67 mAP improvement over real-only baselines.

📝 Abstract

Object detection in adverse weather is critical for the safety of autonomous vehicles; however, the scarcity of labelled, real-world foggy data remains a significant bottleneck. In this paper, we propose Clear2Fog (C2F), an end-to-end, physics-based pipeline that simulates fog on clear-weather datasets while ensuring sensor-level consistency across camera and LiDAR. By using monocular depth estimation and a novel atmospheric light estimation method, C2F overcomes structural artifacts and chromatic biases common in existing techniques. A human perceptual study confirms C2F's physical realism, with the generated images being preferred 92.95% of the time over an established method. Utilising a training set of 270,000 images from the Waymo Open Dataset, we conduct an extensive data efficiency study to investigate how environmental diversity influences model robustness. Our findings reveal that models trained on mixed-density fog datasets at 75% scale outperform those trained on fixed-density datasets at 100% scale. Furthermore, we investigate the sim-to-real transfer by fine-tuning pre-trained models on real-world foggy data. We demonstrate that a tenfold increase over the default fine-tuning learning rate successfully overcomes negative transfer from synthetic biases, resulting in a 1.67 mAP improvement over real-only baselines. The C2F pipeline provides a scalable framework for enhancing the reliability of autonomous systems in adverse weather and demonstrates the potential of diverse synthetic datasets for efficient model training.
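
The abstract does not spell out C2F's image formation equations. As a rough illustration of how depth and atmospheric light can combine in a physics-based fog simulation, the sketch below assumes the standard atmospheric scattering (Koschmieder) model, I = J·t + A·(1 − t) with transmission t = exp(−β·d). The function names `add_fog` and `sample_betas`, the scattering coefficients, and the airlight value are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np


def add_fog(image, depth_m, beta, airlight):
    """Apply homogeneous fog with the standard atmospheric scattering model.

    Assumed model (not confirmed by the source): I = J * t + A * (1 - t),
    where t = exp(-beta * d).

    image    : float array (H, W, 3) in [0, 1], the clear-weather image J
    depth_m  : float array (H, W), per-pixel depth d in metres
               (e.g. from a monocular depth estimator)
    beta     : scattering coefficient in 1/m; larger means denser fog
    airlight : length-3 sequence in [0, 1], the atmospheric light A
    """
    image = np.asarray(image, dtype=np.float64)
    depth_m = np.asarray(depth_m, dtype=np.float64)
    airlight = np.asarray(airlight, dtype=np.float64).reshape(1, 1, 3)

    # Transmission decays exponentially with distance.
    t = np.exp(-beta * depth_m)[..., np.newaxis]

    # Attenuated scene radiance plus scattered atmospheric light.
    foggy = image * t + airlight * (1.0 - t)
    return np.clip(foggy, 0.0, 1.0)


def sample_betas(n_images, betas=(0.01, 0.03, 0.06), seed=0):
    """Draw one fog density per image to build a mixed-density set.

    The candidate beta values are illustrative placeholders only.
    """
    rng = np.random.default_rng(seed)
    return rng.choice(betas, size=n_images)
```

Under this model, a pixel at zero depth is left unchanged, while a very distant pixel converges to the airlight colour. A mixed-density training set of the kind the abstract describes could then be approximated by calling `add_fog` with a different `beta` from `sample_betas` for each image, whereas a fixed-density set would reuse a single `beta` throughout.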

Problem

Research questions and friction points this paper is trying to address.

object detection

adverse weather

fog

data scarcity

autonomous vehicles

Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic fog

physics-based rendering

data efficiency