Revisiting Gradient-based Uncertainty for Monocular Depth Estimation

📅 2025-02-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Monocular depth estimation produces unreliable predictions where the image is ambiguous, for example around dynamic objects or shadows, so safety-critical applications need pixel-wise uncertainty estimates. This paper proposes a post hoc, gradient-based uncertainty estimation method for already trained depth models, with no re-training required. To extract gradients without depending on ground-truth depth, it introduces an auxiliary loss that measures the consistency between the predicted depth and a reference depth. The reference depth acts as pseudo ground truth and is generated with a simple image or feature augmentation. The final uncertainty score is computed from the derivatives of this loss with respect to the feature maps of one or several layers, obtained by back-propagation. Experiments on the KITTI and NYU benchmarks show that the method determines uncertainty effectively and, for models trained on monocular sequences, outperforms related approaches. Code and models are publicly available.

📝 Abstract

Monocular depth estimation, similar to other image-based tasks, is prone to erroneous predictions due to ambiguities in the image, for example, caused by dynamic objects or shadows. For this reason, pixel-wise uncertainty assessment is required for safety-critical applications to highlight the areas where the prediction is unreliable. We address this in a post hoc manner and introduce gradient-based uncertainty estimation for already trained depth estimation models. To extract gradients without depending on the ground truth depth, we introduce an auxiliary loss function based on the consistency of the predicted depth and a reference depth. The reference depth, which acts as pseudo ground truth, is in fact generated using a simple image or feature augmentation, making our approach simple and effective. To obtain the final uncertainty score, the derivatives w.r.t. the feature maps from single or multiple layers are calculated using back-propagation. We demonstrate that our gradient-based approach is effective in determining the uncertainty without re-training using the two standard depth estimation benchmarks KITTI and NYU. In particular, for models trained with monocular sequences and therefore most prone to uncertainty, our method outperforms related approaches. In addition, we publicly provide our code and models: https://github.com/jhornauer/GrUMoDepth
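
The pipeline described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (see the linked repository for that). The toy encoder and decoder, the horizontal flip as the augmentation, the squared-difference form of the auxiliary loss, and the channel-wise maximum used to reduce gradients to one score per pixel are all assumptions made for illustration, and every function name is hypothetical. Because the toy decoder is linear, the gradient with respect to the feature maps is written in closed form rather than computed by back-propagation.

```python
# Toy illustration of gradient-based uncertainty; all names are hypothetical.

W = [0.5, 2.0]  # weights of a toy linear "decoder", one per feature channel


def hflip(grid):
    """Horizontally flip a 2D list (rows of pixels)."""
    return [row[::-1] for row in grid]


def encode(image):
    """Toy encoder producing two feature channels per pixel.

    Channel 0 is the intensity. Channel 1 is the difference to the left
    neighbour, which makes the prediction sensitive to horizontal flips.
    """
    intensity = [row[:] for row in image]
    left_diff = [
        [row[x] - row[x - 1] if x > 0 else 0.0 for x in range(len(row))]
        for row in image
    ]
    return [intensity, left_diff]


def decode(features):
    """Toy decoder: per-pixel weighted sum over the feature channels."""
    height, width = len(features[0]), len(features[0][0])
    return [
        [sum(W[c] * features[c][y][x] for c in range(len(W))) for x in range(width)]
        for y in range(height)
    ]


def gradient_uncertainty(image):
    """Return one uncertainty score per pixel for the toy depth model."""
    depth = decode(encode(image))
    # Reference depth (pseudo ground truth): predict on the flipped image,
    # then flip the prediction back into the original frame.
    reference = hflip(decode(encode(hflip(image))))
    # Assumed auxiliary loss: L = sum over pixels of (depth - reference)^2,
    # with the reference held constant. For the linear decoder,
    # dL/dfeatures[c][y][x] = 2 * (depth[y][x] - reference[y][x]) * W[c].
    height, width = len(depth), len(depth[0])
    return [
        [
            max(abs(2.0 * (depth[y][x] - reference[y][x]) * wc) for wc in W)
            for x in range(width)
        ]
        for y in range(height)
    ]
```

In this toy setting the score is zero wherever the prediction agrees with the reference (flat image regions) and grows where the two disagree (around horizontal intensity edges). With a real network, the same quantity would be obtained by back-propagating the auxiliary loss to the chosen feature maps.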

Problem

Research questions and friction points this paper is trying to address.

Monocular depth estimation errors

Pixel-wise uncertainty assessment

Gradient-based uncertainty estimation

Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient-based uncertainty estimation

Auxiliary loss function

Feature map derivatives calculation

🔎 Similar Papers

Self-supervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion