🤖 AI Summary
Nighttime monocular depth estimation suffers from low accuracy due to insufficient illumination and often relies on expensive LiDAR sensors. Method: the paper proposes a light-augmented depth estimation paradigm: structured light patterns are projected by the vehicle's high-definition headlights, their interaction with the scene is modeled via physically based rendering, and an end-to-end differentiable light-augmentation module is integrated into mainstream monocular models (e.g., DepthFormer, AdaBins). Contribution/Results: the authors show for the first time that active illumination significantly enhances global geometric understanding, even in regions the pattern does not directly illuminate. They introduce the Nighttime Synthetic Drive Dataset (NSDD), the first large-scale synthetic nighttime driving dataset (49,990 finely annotated images). Extensive evaluation on both synthetic and real-world data shows an 18.3% reduction in AbsRel error and strong generalization, establishing a reliable, low-cost depth-sensing pathway for LiDAR-free nighttime autonomous driving.
📝 Abstract
Nighttime camera-based depth estimation is a highly challenging task, especially for autonomous driving applications, where accurate depth perception is essential for safe navigation. We aim to improve the reliability of perception systems at night, where models trained on daytime data often fail in the absence of precise but costly LiDAR sensors. In this work, we introduce Light Enhanced Depth (LED), a novel cost-effective approach that significantly improves depth estimation in low-light environments by harnessing a pattern projected by the high-definition headlights available in modern vehicles. LED yields significant performance boosts across multiple depth-estimation architectures (encoder-decoder, AdaBins, DepthFormer) on both synthetic and real datasets. Furthermore, performance gains beyond the illuminated areas reveal a holistic enhancement in scene understanding. Finally, we release the Nighttime Synthetic Drive Dataset, a new synthetic and photo-realistic nighttime dataset comprising 49,990 comprehensively annotated images.
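The abstract does not spell out how the projected headlight pattern enters the depth network. A minimal NumPy sketch of the general idea, under assumptions of mine (the function name, the additive lighting model, and the 4-channel input are all hypothetical, not the paper's actual module):

```python
import numpy as np

def augment_with_pattern(rgb, pattern, gain=0.6):
    """Hypothetical sketch: brighten a dark scene with a known projected
    headlight pattern, then append that pattern as an extra channel so a
    downstream depth network can exploit the known illumination prior.

    rgb:     (H, W, 3) float array in [0, 1], nighttime image
    pattern: (H, W)    float array in [0, 1], projected light pattern
    """
    # Crude additive illumination model (a stand-in for the paper's
    # physically based rendering of the light/scene interaction).
    lit = np.clip(rgb + gain * pattern[..., None], 0.0, 1.0)
    # 4-channel network input: illuminated image + known pattern prior.
    return np.concatenate([lit, pattern[..., None]], axis=-1)

# Toy usage: a dark scene with a striped projection pattern.
rgb = np.random.rand(4, 4, 3) * 0.1
pattern = np.zeros((4, 4))
pattern[::2] = 1.0
x = augment_with_pattern(rgb, pattern)   # shape (4, 4, 4)
```

Feeding the known pattern alongside the image (rather than the image alone) is one plausible way a differentiable module could let the network compare the emitted pattern with its deformed appearance on scene geometry.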