Iris: Bringing Real-World Priors into Diffusion Model for Monocular Depth Estimation

📅 2026-03-17
📈 Citations: 0
Influential: 0
🤖 AI Summary
Monocular depth estimation still struggles to preserve fine details, generalize to real-world scenes, and learn effectively from limited training data. This work proposes Iris, a framework that introduces Spectral-Gated Distillation (SGD) and Spectral-Gated Consistency (SGC), combined through a two-stage Priors-to-Geometry Deterministic (PGD) schedule, to incorporate real-world priors into diffusion models. SGD transfers low-frequency real-world priors, while SGC enforces high-frequency fidelity during refinement with synthetic ground truth, balancing geometric fidelity against generalization in synthetic-to-real domain transfer. The authors report improved depth estimation accuracy with limited training data and strong generalization across diverse real-world scenes.

📝 Abstract
In this paper, we propose Iris, a deterministic framework for Monocular Depth Estimation (MDE) that integrates real-world priors into the diffusion model. Conventional feed-forward methods rely on massive training data, yet still miss details. Previous diffusion-based methods leverage rich generative priors yet struggle with synthetic-to-real domain transfer. Iris, in contrast, preserves fine details, generalizes strongly from synthetic to real scenes, and remains efficient with limited training data. To this end, we introduce a two-stage Priors-to-Geometry Deterministic (PGD) schedule: the prior stage uses Spectral-Gated Distillation (SGD) to transfer low-frequency real priors while leaving high-frequency details unconstrained, and the geometry stage applies Spectral-Gated Consistency (SGC) to enforce high-frequency fidelity while refining with synthetic ground truth. The two stages share weights and are executed with a high-to-low timestep schedule. Extensive experimental results confirm that Iris achieves significant improvements in MDE performance with strong in-the-wild generalization.
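The frequency split that the abstract describes can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the spectral gate is built, so the hard radial mask in the Fourier domain, the `cutoff` value, the function names, and the choice of MSE and L1 losses are all assumptions made for illustration. The first loss constrains only the low-frequency band (in the spirit of SGD), and the second compares only the high-frequency band (in the spirit of SGC).

```python
import numpy as np


def spectral_split(x: np.ndarray, cutoff: float = 0.1):
    """Split a 2-D map into low- and high-frequency parts.

    Assumption: a hard radial mask in the Fourier domain stands in for the
    paper's spectral gate. `cutoff` is a hypothetical radius in cycles/pixel.
    """
    h, w = x.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, shape (h, 1)
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, shape (1, w)
    low_mask = (np.sqrt(fy ** 2 + fx ** 2) <= cutoff).astype(float)
    low = np.fft.ifft2(np.fft.fft2(x) * low_mask).real
    high = x - low  # everything the low-pass mask rejected
    return low, high


def low_freq_distill_loss(student, teacher, cutoff=0.1):
    """MSE on the low-frequency band only; high frequencies stay unconstrained."""
    s_low, _ = spectral_split(student, cutoff)
    t_low, _ = spectral_split(teacher, cutoff)
    return float(np.mean((s_low - t_low) ** 2))


def high_freq_consistency_loss(pred, reference, cutoff=0.1):
    """L1 distance on the high-frequency band only."""
    _, p_high = spectral_split(pred, cutoff)
    _, r_high = spectral_split(reference, cutoff)
    return float(np.mean(np.abs(p_high - r_high)))


# Toy check: a smooth ramp versus the same ramp plus a pixel-level checkerboard.
ramp = np.outer(np.linspace(0.0, 1.0, 16), np.ones(16))
rows, cols = np.indices((16, 16))
checker = np.where((rows + cols) % 2 == 0, 1.0, -1.0)
detailed = ramp + checker

print(low_freq_distill_loss(detailed, ramp))       # close to zero
print(high_freq_consistency_loss(detailed, ramp))  # close to one
```

In the toy check the two maps differ only by a checkerboard at the Nyquist frequency, so the low-frequency loss ignores the difference while the high-frequency loss picks it up in full. A learned or soft gate would behave differently near the cutoff.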
Problem

Research questions and friction points this paper is trying to address.

Monocular Depth Estimation
Diffusion Model
Real-World Priors
Domain Transfer
Detail Preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion model
monocular depth estimation
real-world priors
spectral-gated distillation
domain generalization
Xinhao Cai
Nanjing University of Science and Technology
computer vision, machine learning
Gensheng Pei
Department of Electrical and Computer Engineering, Sungkyunkwan University
Zeren Sun
Associate Professor, Nanjing University of Science and Technology
computer vision, deep learning, fine-grained visual recognition, learning from label noise
Yazhou Yao
Nanjing University of Science and Technology; State Key Laboratory of Intelligent Manufacturing of Advanced Construction Machinery
Fumin Shen
University of Electronic Science and Technology of China
Wenguan Wang
Zhejiang University
Neural-Symbolic AI, Embodied AI, Autonomous Cars, Computer Vision, Artificial Intelligence