🤖 AI Summary
In monocular depth estimation, global normalization amplifies pseudo-label noise, severely limiting knowledge distillation performance. To address this, we propose a cross-context distillation framework with multi-teacher collaboration. First, we systematically analyze how depth normalization affects pseudo-label quality. Second, we design a cross-scale context modeling mechanism that jointly leverages global and local depth cues to make pseudo-labels more robust. Third, we introduce a complementary multi-teacher distillation paradigm that mitigates the generalization bottleneck of single-teacher models. Our method pioneers cross-context distillation, explicitly bridging contextual information across scales for improved pseudo-label fidelity. Evaluated on the NYUv2 and KITTI benchmarks, it achieves state-of-the-art performance, reducing AbsRel by 12.3% over prior methods. Qualitative results further confirm substantial improvements in depth-detail preservation and boundary accuracy.
📝 Abstract
Monocular depth estimation (MDE) aims to predict scene depth from a single RGB image and plays a crucial role in 3D scene understanding. Recent advances in zero-shot MDE leverage normalized depth representations and distillation-based learning to improve generalization across diverse scenes. However, current distillation pipelines rely on global depth normalization, which can amplify noise in pseudo-labels and reduce distillation effectiveness. In this paper, we systematically analyze the impact of different depth normalization strategies on pseudo-label distillation. Based on our findings, we propose Cross-Context Distillation, which integrates global and local depth cues to enhance pseudo-label quality. Additionally, we introduce a multi-teacher distillation framework that leverages the complementary strengths of different depth estimation models, leading to more robust and accurate depth predictions. Extensive experiments on benchmark datasets demonstrate that our approach significantly outperforms state-of-the-art methods, both quantitatively and qualitatively.
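To make the global-vs-local normalization issue concrete, here is a minimal NumPy sketch. The paper's exact loss is not given in the abstract, so this illustrates the general idea only: a MiDaS-style affine-invariant normalization (median shift, mean-absolute-deviation scale) applied either to the whole depth map or independently to local windows, so that noisy pseudo-label values in one region do not skew the normalization statistics everywhere else. The function names, the window size, and the L1 comparison are all hypothetical choices, not the authors' implementation.

```python
import numpy as np

def normalize_depth(d, eps=1e-6):
    """Affine-invariant depth normalization: subtract the median,
    divide by the mean absolute deviation (MiDaS-style)."""
    t = np.median(d)
    s = np.mean(np.abs(d - t)) + eps
    return (d - t) / s

def local_normalized_loss(student, teacher, crop=64):
    """Hypothetical local-context distillation loss: normalize the
    student prediction and the teacher pseudo-label within each local
    window independently, then average an L1 discrepancy over windows.
    A global variant would instead call normalize_depth() once on the
    full maps, letting outlier regions dominate the shared statistics."""
    h, w = student.shape
    losses = []
    for i in range(0, h - crop + 1, crop):
        for j in range(0, w - crop + 1, crop):
            s_win = normalize_depth(student[i:i + crop, j:j + crop])
            t_win = normalize_depth(teacher[i:i + crop, j:j + crop])
            losses.append(np.mean(np.abs(s_win - t_win)))
    return float(np.mean(losses))
```

Because each window is normalized on its own, a corrupted patch in the pseudo-label perturbs only the loss terms for that window; under global normalization the same patch would shift the median and scale used for every pixel.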