UfM*: Uncertainty from Motion* for DNN Depth Estimation Using Gaussians

📅 2026-05-21
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the lack of efficient, reliable uncertainty quantification for monocular depth estimation in safety-critical robotic systems. Existing approaches such as ensembles and sampling require multiple inferences per image, which costs substantial compute and memory, and uncertainty predicted from a single image cannot capture disagreement between views of the same region. The authors propose UfM*, which uses motion cues: a compact Gaussian mixture compares depth predictions from previous and current views to measure multiview disagreement with only a single DNN inference per image. Representing regions of 3D space with Gaussians instead of a point cloud, as in a prior approach, reduces compute and memory costs and improves the uncertainty estimate. On 100 out-of-distribution ScanNet sequences, UfM* paired with aleatoric uncertainty improves expected calibration error by 24–28% over an ensemble while using only 3% of the energy and 0.02% of the memory. It runs in real time at 30 FPS, consuming 63 mJ per 224x224 image, on an Arm Cortex-A76 CPU.
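To make the idea of measuring multiview disagreement with Gaussians concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes each view contributes a one-dimensional Gaussian over depth for a single region (the paper works with Gaussians over regions of 3D space), and the names `Gaussian1D`, `merge`, and `disagreement` are hypothetical. Two Gaussians are merged by moment matching, and the variance added by the gap between their means is treated as the disagreement.

```python
from dataclasses import dataclass


@dataclass
class Gaussian1D:
    """Hypothetical running summary of depth observations for one region."""
    weight: float  # number of observations merged so far
    mean: float    # mean depth in metres
    var: float     # depth variance in metres squared


def merge(a: Gaussian1D, b: Gaussian1D) -> Gaussian1D:
    """Moment-match two weighted Gaussians into a single Gaussian."""
    w = a.weight + b.weight
    mean = (a.weight * a.mean + b.weight * b.mean) / w
    # Law of total variance: within-view spread plus spread between the means.
    var = (
        a.weight * (a.var + (a.mean - mean) ** 2)
        + b.weight * (b.var + (b.mean - mean) ** 2)
    ) / w
    return Gaussian1D(w, mean, var)


def disagreement(previous: Gaussian1D, current: Gaussian1D) -> float:
    """Variance that comes only from the two views disagreeing on the mean."""
    merged = merge(previous, current)
    within = (
        previous.weight * previous.var + current.weight * current.var
    ) / merged.weight
    return merged.var - within


if __name__ == "__main__":
    prev_view = Gaussian1D(weight=1.0, mean=2.0, var=0.01)
    agreeing = Gaussian1D(weight=1.0, mean=2.0, var=0.01)
    conflicting = Gaussian1D(weight=1.0, mean=2.4, var=0.01)
    print(disagreement(prev_view, agreeing))
    print(disagreement(prev_view, conflicting))
```

Only three numbers are stored per region regardless of how many views have been merged, which illustrates why a Gaussian summary can need far less memory than keeping every predicted point.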
📝 Abstract
Reliable uncertainty estimation is critical for deploying monocular depth deep neural networks (DNNs) in safety-critical robotic systems. Conventional uncertainty methods such as ensembles and sampling-based approaches require multiple inferences per image, incurring substantial compute and memory overhead. Moreover, uncertainty predicted from a single image misses out on measuring disagreement between predictions across views of the same region. We propose Uncertainty from Motion* (UfM*), an uncertainty estimation algorithm that measures multiview disagreement efficiently by comparing previous and current views using a compact Gaussian mixture, requiring only a single DNN inference per image. Using Gaussians to compute multiview disagreement is not only more compute- and memory-efficient than a prior approach using a point cloud, but also improves uncertainty by measuring disagreement across regions of 3D space. UfM* paired with aleatoric uncertainty improves expected calibration error by 24-28% compared to an ensemble, while requiring only 3% of the energy and 0.02% of the memory on 100 out-of-distribution ScanNet sequences. We demonstrate UfM* consumes only 63 mJ per 224x224 image while running real-time at 30 FPS on an Arm Cortex-A76 CPU onboard a miniature energy-constrained robot, highlighting that measuring multiview disagreement using Gaussians enables efficient uncertainty for resource-constrained robotic systems.
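The abstract reports results as expected calibration error without giving its definition here. The sketch below shows one common interval-coverage formulation for regression, assuming Gaussian predictive distributions; the paper's exact metric may differ, and `coverage_level` and `regression_ece` are hypothetical names. For each confidence level p, it compares p with the fraction of ground-truth depths that fall inside the predicted central p-interval.

```python
import math


def coverage_level(y_true: float, mean: float, std: float) -> float:
    """Smallest central-interval confidence of N(mean, std^2) containing y_true."""
    z = abs(y_true - mean) / std
    return math.erf(z / math.sqrt(2.0))


def regression_ece(y_true, means, stds, num_levels: int = 10) -> float:
    """Mean absolute gap between nominal and empirical interval coverage."""
    levels = [(i + 1) / num_levels for i in range(num_levels)]
    covered = [coverage_level(y, m, s) for y, m, s in zip(y_true, means, stds)]
    gaps = []
    for p in levels:
        # Fraction of samples whose truth lies inside the central p-interval.
        empirical = sum(c <= p for c in covered) / len(covered)
        gaps.append(abs(empirical - p))
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    depths = [1.0, 2.0, 3.0]
    # Perfect predictions with a non-zero std are under-confident,
    # so the calibration error is well above zero.
    print(regression_ece(depths, depths, [1.0, 1.0, 1.0]))
```

A well-calibrated predictor drives every gap toward zero; both over-confident and under-confident uncertainties increase the score.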
Problem

Research questions and friction points this paper is trying to address.

uncertainty estimation
monocular depth estimation
multiview disagreement
resource-constrained robotics
deep neural networks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uncertainty from Motion
Gaussian mixture
monocular depth estimation
multiview disagreement
efficient uncertainty