SPADE: Sparsity Adaptive Depth Estimator for Zero-Shot, Real-Time, Monocular Depth Estimation in Underwater Environments

📅 2025-10-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Underwater infrastructure inspection is hindered by turbid water and complex structural geometries, while conventional diver- or ROV-based approaches suffer from perceptual and operational bottlenecks. To address scale ambiguity and the inefficient use of sparse depth priors in zero-shot monocular underwater depth estimation, this paper proposes a training-free, real-time method for adaptive depth densification and metric-scale recovery. The approach integrates a pre-trained relative depth model with sparse depth priors in a two-stage framework: it first scales the relative depth map to metric units using the sparse depth points, then refines the dense prediction with Cascade Conv-Deformable Transformer blocks. The method supports embedded deployment, achieving state-of-the-art accuracy and generalisation on real underwater scenes while operating at over 15 FPS.

📝 Abstract
Underwater infrastructure requires frequent inspection and maintenance due to harsh marine conditions. Current reliance on human divers or remotely operated vehicles is limited by perceptual and operational challenges, especially around complex structures or in turbid water. Enhancing the spatial awareness of underwater vehicles is key to reducing piloting risks and enabling greater autonomy. To address these challenges, we present SPADE: SParsity Adaptive Depth Estimator, a monocular depth estimation pipeline that combines a pre-trained relative depth estimator with sparse depth priors to produce dense, metric-scale depth maps. Our two-stage approach first scales the relative depth map with the sparse depth points, then refines the final metric prediction with our proposed Cascade Conv-Deformable Transformer blocks. Our approach achieves improved accuracy and generalisation over state-of-the-art baselines and runs efficiently at over 15 FPS on embedded hardware, promising to support practical underwater inspection and intervention. This work has been submitted to the IEEE Journal of Oceanic Engineering Special Issue on AUV 2026.
Problem

Research questions and friction points this paper is trying to address.

Estimating dense metric depth from monocular underwater images
Overcoming perceptual limitations in turbid water environments
Enabling real-time depth estimation for autonomous underwater vehicles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines a pre-trained relative depth estimator with sparse depth priors
Uses a two-stage scaling and refinement approach
Runs efficiently on embedded hardware at over 15 FPS
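The first stage, scaling a relative depth map to metric units with sparse depth points, can be illustrated with a simple closed-form fit. The sketch below uses a least-squares scale-and-shift alignment at the sparse pixel locations; this is a common baseline technique and an assumption for illustration, not necessarily SPADE's exact scaling scheme (the function name and array layout are hypothetical):

```python
import numpy as np

def align_relative_depth(rel_depth, sparse_uv, sparse_z):
    """Fit a metric scale s and shift t so that s * rel_depth + t best
    matches the sparse metric depths at their pixel locations.

    rel_depth : (H, W) relative depth map from a pre-trained estimator
    sparse_uv : (N, 2) integer pixel coordinates (u=col, v=row) of priors
    sparse_z  : (N,)   metric depths at those pixels (e.g. from sonar/SLAM)
    """
    # Sample the relative depth at the sparse prior locations
    r = rel_depth[sparse_uv[:, 1], sparse_uv[:, 0]]
    # Solve min_{s,t} || s*r + t - sparse_z ||^2 in closed form
    A = np.stack([r, np.ones_like(r)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, sparse_z, rcond=None)
    return s * rel_depth + t

# Toy check: if the true metric depth is 2*rel + 0.5, four sparse
# samples are enough to recover the dense metric map exactly.
rng = np.random.default_rng(0)
rel = rng.random((4, 4))
uv = np.array([[0, 0], [1, 2], [3, 3], [2, 1]])
z = 2.0 * rel[uv[:, 1], uv[:, 0]] + 0.5
metric = align_relative_depth(rel, uv, z)
```

In SPADE this coarse metric map is then refined by the Cascade Conv-Deformable Transformer blocks; a global scale/shift like the above cannot correct spatially varying errors, which motivates that learned second stage.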
Hongjie Zhang
Nanjing University; Shanghai Artificial Intelligence Laboratory
Gideon Billings
Australian Centre for Robotics, University of Sydney, Sydney, NSW 2006, Australia
Stefan B. Williams
Australian Centre for Robotics, University of Sydney, Sydney, NSW 2006, Australia