The Midas Touch for Metric Depth

📅 2026-05-12
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses three limitations of relative depth estimation: the lack of metric scale, local inconsistencies, and low computational efficiency. It introduces MTD, a mathematically interpretable method that converts relative depth into metric depth from extremely sparse 3D inputs. The method first recovers depth segment by segment through sparse graph optimization, then refines the result pixel by pixel using a discontinuity-aware geodesic cost. According to the authors, it generalizes well and improves accuracy substantially over previous depth completion and depth estimation methods. Its lightweight, plug-and-play design is intended to ease deployment and integration into downstream 3D tasks.
📝 Abstract
Recent advances have markedly improved the cross-scene generalization of relative depth estimation, yet its practical applicability remains limited by the absence of metric scale, local inconsistencies, and low computational efficiency. To address these issues, we present Midas Touch for Depth (MTD), a mathematically interpretable approach that converts relative depth into metric depth using only extremely sparse 3D data. To eliminate local scale inconsistencies, it applies a segment-wise recovery strategy via sparse graph optimization, followed by a pixel-wise refinement strategy using a discontinuity-aware geodesic cost. MTD exhibits strong generalization and achieves substantial accuracy improvements over previous depth completion and depth estimation methods. Moreover, its lightweight, plug-and-play design facilitates deployment and integration on diverse downstream 3D tasks. Project page is available at https://mias.group/MTD.
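To make the idea of converting relative depth into metric depth from sparse points concrete, here is a minimal sketch. It is not the authors' MTD method: it replaces the sparse graph optimization and geodesic refinement with plain least-squares fits of a scale and shift, once globally and once per segment. The function names, the `segments` label map, and the assumption that metric depth is affine in the relative prediction are all illustrative choices, not details taken from the paper.

```python
import numpy as np


def fit_scale_shift(rel, metric):
    """Least-squares fit of metric ~= scale * rel + shift (assumed affine model)."""
    A = np.stack([rel, np.ones_like(rel)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(A, metric, rcond=None)
    return scale, shift


def align_global(rel_depth, sparse_mask, sparse_metric):
    """Apply a single scale/shift pair, fitted on the sparse points, to the whole map."""
    scale, shift = fit_scale_shift(rel_depth[sparse_mask], sparse_metric[sparse_mask])
    return scale * rel_depth + shift


def align_per_segment(rel_depth, segments, sparse_mask, sparse_metric, min_points=2):
    """Fit one scale/shift pair per segment (hypothetical label map).

    Segments with fewer than `min_points` sparse measurements keep the global fit.
    """
    out = align_global(rel_depth, sparse_mask, sparse_metric)
    for label in np.unique(segments):
        region = segments == label
        points = region & sparse_mask
        if points.sum() >= min_points:
            scale, shift = fit_scale_shift(rel_depth[points], sparse_metric[points])
            out[region] = scale * rel_depth[region] + shift
    return out
```

A single global fit cannot absorb scale differences between regions, which is the kind of local inconsistency the abstract describes. Fitting per segment handles that in this toy setting, but it does nothing about segments with no measurements or about errors near depth discontinuities, which the paper's graph optimization and geodesic refinement are designed to address.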
Problem

Research questions and friction points this paper is trying to address.

metric depth
relative depth estimation
scale inconsistency
computational efficiency
depth completion
Innovation

Methods, ideas, or system contributions that make the work stand out.

metric depth estimation
sparse graph optimization
discontinuity-aware geodesic cost
segment-wise recovery
plug-and-play depth conversion
Yu Ma
Indiana University
Computer Science
Zizhan Guo
College of Electronic and Information Engineering, Tongji University
Zuyi Xiong
College of Electronic and Information Engineering, Tongji University
Haoran Zhang
College of Electronic and Information Engineering, Tongji University
Yi Feng
Tongji University
Computer Vision
Hongbo Zhao
College of Electronic and Information Engineering, Tongji University
Hanli Wang
Tongji University
Multimedia Computing, Computer Vision, Image Processing, Machine Learning
Rui Fan
College of Electronic and Information Engineering, Tongji University; Shanghai Research Institute for Intelligent Autonomous Systems, Tongji University; National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an Jiaotong University