Confidence-aware Monocular Depth Estimation for Minimally Invasive Surgery

📅 2026-03-03
🤖 AI Summary
This work addresses inaccurate and unreliable monocular depth estimation in endoscopic videos, which are frequently degraded by smoke, specular reflections, blur, and occlusions. The authors propose a confidence-aware monocular depth estimation framework that uses an ensemble of fine-tuned stereo matching models to generate pixel-wise confidence targets, trains baseline depth models with a confidence-aware loss function, and adds a lightweight two-layer convolutional confidence head that predicts per-pixel reliability of the depth output at inference time. On the in-house clinical StereoKP dataset, the method improves dense depth estimation accuracy by approximately 8% over the baseline model, and experiments on internal and public datasets indicate that its confidence predictions are robust.

📝 Abstract
Purpose: Monocular depth estimation (MDE) is vital for scene understanding in minimally invasive surgery (MIS). However, endoscopic video sequences are often degraded by smoke, specular reflections, blur, and occlusions, which limits the accuracy of MDE models. In addition, current MDE models do not output a depth confidence, which could be a valuable tool for improving their clinical reliability.

Methods: We propose a novel confidence-aware MDE framework with three contributions: (i) Calibrated confidence targets: an ensemble of fine-tuned stereo matching models is used to convert disparity variance into pixel-wise confidence probabilities; (ii) Confidence-aware loss: baseline MDE models are optimized with confidence-aware loss functions that use these pixel-wise confidence probabilities so that reliable pixels dominate training; and (iii) Inference-time confidence: a confidence estimation head with two convolution layers predicts per-pixel confidence at inference, enabling assessment of depth reliability.

Results: Experiments across internal and public datasets show that our framework improves depth estimation accuracy and can robustly quantify the confidence of its predictions. On the internal clinical endoscopic dataset (StereoKP), we improve dense depth estimation accuracy by ~8% compared with the baseline model.

Conclusion: Our confidence-aware framework improves the accuracy of MDE models in MIS, addressing challenges posed by noise and artifacts in pre-clinical and clinical data, and allows MDE models to provide confidence maps that may be used to improve their reliability for clinical applications.
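
The first two contributions in the Methods can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything specific here is an assumption made for illustration: the exponential mapping from ensemble variance to confidence, the `scale` parameter, the use of an L1 error, and the function names `confidence_targets` and `confidence_weighted_l1`. Maps are plain nested lists so the example runs with the standard library only.

```python
import math
from statistics import pvariance


def confidence_targets(disparities, scale=1.0):
    """Turn an ensemble of disparity maps into a per-pixel confidence map.

    disparities: list of K maps, each an H x W nested list of floats.
    Returns an H x W nested list with values in (0, 1].
    ASSUMPTION: confidence = exp(-variance / scale); the paper's actual
    calibration is not stated in the abstract.
    """
    height, width = len(disparities[0]), len(disparities[0][0])
    return [
        [
            math.exp(-pvariance([d[i][j] for d in disparities]) / scale)
            for j in range(width)
        ]
        for i in range(height)
    ]


def confidence_weighted_l1(pred, target, conf, eps=1e-8):
    """Confidence-weighted mean absolute error over all pixels.

    ASSUMPTION: a normalized, confidence-weighted L1 error stands in for
    the paper's confidence-aware loss, whose exact form is not given.
    High-confidence pixels contribute more to the loss than unreliable ones.
    """
    weighted_error = 0.0
    total_weight = 0.0
    for p_row, t_row, c_row in zip(pred, target, conf):
        for p, t, c in zip(p_row, t_row, c_row):
            weighted_error += c * abs(p - t)
            total_weight += c
    return weighted_error / (total_weight + eps)


# Two ensemble members agree on the first pixel and disagree on the second.
ensemble = [[[1.0, 2.0]], [[1.0, 4.0]]]
conf = confidence_targets(ensemble)

# The second pixel has the larger error but the lower confidence,
# so it is down-weighted relative to a plain mean absolute error.
loss = confidence_weighted_l1([[1.0, 5.0]], [[2.0, 3.0]], conf)
```

In this toy case the pixel where the ensemble disagrees receives a lower confidence, so its error counts for less than it would in an unweighted mean, which is the sense in which reliable pixels dominate training.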
Problem

Research questions and friction points this paper is trying to address.

Monocular Depth Estimation
Minimally Invasive Surgery
Depth Confidence
Endoscopic Video
Scene Understanding
Innovation

Methods, ideas, or system contributions that make the work stand out.

Confidence-Aware Depth Estimation
Monocular Depth Estimation
Minimally Invasive Surgery
Pixel-Wise Confidence
Endoscopic Video