Adapting Depth Anything to Adverse Imaging Conditions with Events

📅 2026-01-05
🏛️ arXiv.org
📈 Citations: 0
Influential citations: 0
🤖 AI Summary
This work addresses the significant performance degradation of existing depth estimation models, such as Depth Anything, under adverse imaging conditions including extreme illumination and motion blur. To tackle this challenge, we propose ADAE, a plug-and-play framework that seamlessly integrates event camera data into foundation depth models without requiring retraining, thereby preserving their open-world generalization capability. ADAE jointly models illumination degradation and motion blur through entropy-aware spatial feature fusion and event-guided temporal motion correction, establishing a unified spatiotemporal fusion architecture. Experimental results demonstrate that ADAE substantially enhances the accuracy and robustness of Depth Anything across diverse degradation scenarios.
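
To make the entropy-aware idea concrete: local entropy of the frame signal can act as a per-region confidence gate, so informative (high-entropy) regions keep frame features while saturated or under-exposed regions fall back on event features. The PyTorch sketch below is our own minimal illustration under that reading; the soft-histogram entropy estimate, the gating rule, and the names `patch_entropy` and `entropy_aware_fusion` are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def patch_entropy(x, num_bins=16, patch=8):
    """Approximate Shannon entropy per spatial patch via a soft histogram.

    x: (B, C, H, W) tensor with values roughly in [0, 1].
    Returns a (B, 1, H // patch, W // patch) entropy map.
    """
    centers = torch.linspace(0.0, 1.0, num_bins, device=x.device)
    # Soft-assign every value to each histogram bin (differentiable).
    dist = (x.unsqueeze(1) - centers.view(1, -1, 1, 1, 1)) ** 2   # (B, bins, C, H, W)
    weights = torch.exp(-dist / 0.01)
    weights = weights / (weights.sum(dim=1, keepdim=True) + 1e-8)
    # Bin probabilities per patch: average over channels, then pool spatially.
    probs = F.avg_pool2d(weights.mean(dim=2), kernel_size=patch)  # (B, bins, H/p, W/p)
    return -(probs * torch.log(probs + 1e-8)).sum(dim=1, keepdim=True)

def entropy_aware_fusion(frame_feat, event_feat):
    """Blend frame and event features, gated by frame-feature entropy."""
    ent = patch_entropy(frame_feat.sigmoid())          # squash to [0, 1] first
    w = torch.sigmoid(ent - ent.mean())                # high entropy -> trust the frame
    w = F.interpolate(w, size=frame_feat.shape[-2:],
                      mode="bilinear", align_corners=False)
    return w * frame_feat + (1.0 - w) * event_feat
```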

📝 Abstract
Robust depth estimation under dynamic and adverse lighting conditions is essential for robotic systems. Depth foundation models such as Depth Anything achieve great success in ideal scenes but struggle under adverse imaging conditions such as extreme illumination and motion blur. These degradations corrupt the visual signals of frame cameras, weakening the discriminative features that frame-based depth estimation relies on across the spatial and temporal dimensions. Existing approaches typically incorporate event cameras, leveraging their high dynamic range and temporal resolution to compensate for corrupted frame features. However, such specialized fusion models are predominantly trained from scratch on domain-specific datasets and therefore fail to inherit the open-world knowledge and robust generalization of foundation models. In this work, we propose ADAE, an event-guided spatiotemporal fusion framework for Depth Anything in degraded scenes. Our design is guided by two key insights: 1) Entropy-Aware Spatial Fusion. We adaptively merge frame-based and event-based features, using an information-entropy strategy to indicate illumination-induced degradation. 2) Motion-Guided Temporal Correction. We use event-based motion cues to recalibrate ambiguous features in blurred regions. Under our unified framework, the two components complement each other and jointly enhance Depth Anything under adverse imaging conditions. Extensive experiments verify the superiority of the proposed method. Our code will be released upon acceptance.
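
The second insight, motion-guided temporal correction, can likewise be pictured as an event-gated feature update: event density serves as a motion proxy, and features are recalibrated only where motion, and hence frame blur, is likely. The module below is a hypothetical PyTorch toy under that assumption; the layer choices, the `event_voxel` input format, and the class name are illustrative rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class MotionGuidedCorrection(nn.Module):
    """Toy event-gated feature recalibration (hypothetical, not the paper's module).

    Event density is used as a motion proxy: where many events fire, the frame
    is likely blurred, so a refined frame+event feature replaces the frame feature.
    """

    def __init__(self, channels, event_bins=5):
        super().__init__()
        self.motion_gate = nn.Sequential(
            nn.Conv2d(event_bins, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),                    # per-pixel blur likelihood in [0, 1]
        )
        self.refine = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, frame_feat, event_feat, event_voxel):
        # event_voxel: (B, event_bins, H, W) temporal voxel grid at feature resolution.
        gate = self.motion_gate(event_voxel.abs())                      # (B, 1, H, W)
        corrected = self.refine(torch.cat([frame_feat, event_feat], dim=1))
        # Keep frame features where the scene is static; correct where it moves.
        return (1.0 - gate) * frame_feat + gate * corrected
```

Note how the gate leaves static regions untouched, which is consistent with the plug-and-play goal: where frames are reliable, the foundation model's features pass through unchanged.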
Problem

Research questions and friction points this paper is trying to address.

depth estimation
adverse imaging conditions
event cameras
motion blur
extreme illumination
Innovation

Methods, ideas, or system contributions that make the work stand out.

event camera
depth estimation
adverse imaging conditions
spatiotemporal fusion
foundation model adaptation
👥 Authors

Shihan Peng
Huazhong University of Science and Technology
Computer Vision · Depth Estimation · Event Camera

Yuyang Xiong
National Key Lab of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China

Hanyu Zhou
School of Computing, National University of Singapore
Scene Understanding · Multimodal Learning · Event Camera · Domain Adaptation

Zhiwei Shi
National Key Lab of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China

Haoyue Liu
School of Artificial Intelligence and Automation, Huazhong University of Science and Technology
Computer Vision · Event Camera

Gang Chen
Sun Yat-sen University
Domain-Specific Accelerator · Robotics · Embedded Systems

Luxin Yan
Huazhong University of Science and Technology
Computer Vision · Image Processing · Deep Learning

Yi Chang
National Key Lab of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China