Depth Anything with Any Prior

📅 2025-05-15
🤖 AI Summary
This work addresses the challenge of integrating sparse yet metrically accurate depth priors with dense but relative depth predictions to produce accurate, dense, metric depth maps. The authors propose Prior Depth Anything, a framework that fuses the two sources in a coarse-to-fine manner. Its key components are pixel-level metric alignment, distance-aware weighted pre-filling, and a monocular depth estimation (MDE) model conditioned on the normalized pre-filled prior and prediction. Evaluated zero-shot on seven real-world datasets, it matches or surpasses task-specific methods across depth completion, super-resolution, and inpainting. It also handles unseen mixed priors and allows the prediction model to be swapped at test time, giving an adjustable trade-off between accuracy and efficiency.

📝 Abstract
This work presents Prior Depth Anything, a framework that combines the incomplete but precise metric information from depth measurements with the relative but complete geometric structure from depth prediction, generating accurate, dense, and detailed metric depth maps for any scene. To this end, we design a coarse-to-fine pipeline that progressively integrates the two complementary depth sources. First, we introduce pixel-level metric alignment and distance-aware weighting to pre-fill diverse metric priors by explicitly using the depth prediction. This pre-filling narrows the domain gap between prior patterns, enhancing generalization across varying scenarios. Second, we develop a conditioned monocular depth estimation (MDE) model to correct the inherent noise in the depth priors. By conditioning on the normalized pre-filled prior and prediction, the model further implicitly merges the two complementary depth sources. Our model showcases impressive zero-shot generalization across depth completion, super-resolution, and inpainting over 7 real-world datasets, matching or even surpassing previous task-specific methods. More importantly, it performs well on challenging, unseen mixed priors and enables test-time improvements by switching prediction models, providing a flexible accuracy-efficiency trade-off while evolving with advancements in MDE models.
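The pre-filling step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes that each missing pixel is filled by fitting a scale and shift between the relative prediction and the metric prior over its k nearest measured pixels, using inverse-distance weights. The function name `prefill_prior`, the parameter `k`, and the exact weighting are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def prefill_prior(pred, prior, mask, k=5, eps=1e-6):
    """Fill missing metric depth by locally aligning a relative prediction.

    pred  : (H, W) dense relative depth prediction
    prior : (H, W) metric depth, meaningful only where mask is True
    mask  : (H, W) bool, True where a metric measurement exists
    k     : number of nearest measured pixels used per missing pixel (assumed)
    """
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        raise ValueError("at least one metric measurement is required")

    known_pred = pred[ys, xs].astype(np.float64)
    known_prior = prior[ys, xs].astype(np.float64)
    k = min(k, len(ys))

    filled = prior.astype(np.float64).copy()
    for y, x in zip(*np.nonzero(~mask)):
        # Pixel distance from this missing pixel to every measured pixel.
        dist = np.hypot(ys - y, xs - x)
        idx = np.argpartition(dist, k - 1)[:k]  # indices of the k nearest

        # Distance-aware weights: closer measurements count more.
        w = 1.0 / (dist[idx] + eps)

        # Weighted least squares for scale s and shift t such that
        # s * pred + t approximates the metric prior on the neighbours.
        A = np.stack([known_pred[idx], np.ones(k)], axis=1)
        sw = np.sqrt(w)
        (s, t), *_ = np.linalg.lstsq(A * sw[:, None], known_prior[idx] * sw, rcond=None)

        filled[y, x] = s * pred[y, x] + t
    return filled
```

The per-pixel loop keeps the logic readable but is slow for large images; a practical version would batch the neighbour search and the least-squares solves.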

Problem

Research questions and friction points this paper is trying to address.

Combining incomplete metric depth with relative depth prediction
Generating accurate dense metric depth maps for any scene
Enhancing generalization across diverse depth measurement scenarios

Innovation

Methods, ideas, or system contributions that make the work stand out.

Coarse-to-fine pipeline integrates complementary depth sources
Pixel-level metric alignment enhances generalization across scenarios
Conditioned MDE model refines noise and merges depth sources
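
The conditioning on a normalized pre-filled prior and prediction can be sketched as follows. This assumes simple min-max normalization to [0, 1] and a two-channel condition stacked along the first axis; the helper names `minmax_normalize`, `build_condition`, and `denormalize` are hypothetical, and the normalization actually used by the authors may differ.

```python
import numpy as np


def minmax_normalize(depth, eps=1e-8):
    """Scale a depth map to [0, 1]; also return the range for later inversion."""
    lo, hi = float(depth.min()), float(depth.max())
    return (depth - lo) / max(hi - lo, eps), lo, hi


def build_condition(filled_prior, pred):
    """Stack the normalized pre-filled prior and prediction as a 2-channel input."""
    prior_n, lo, hi = minmax_normalize(filled_prior)
    pred_n, _, _ = minmax_normalize(pred)
    cond = np.stack([prior_n, pred_n], axis=0)  # shape (2, H, W)
    return cond, (lo, hi)


def denormalize(depth_n, lo, hi):
    """Map a normalized depth map back to the metric range of the prior."""
    return depth_n * (hi - lo) + lo
```

Normalizing both inputs removes the absolute scale from what the model sees, so under this assumption the metric range is restored afterwards from the prior's own minimum and maximum.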