Learning Inverse Laplacian Pyramid for Progressive Depth Completion

📅 2025-02-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Depth completion aims to reconstruct dense, accurate depth maps from sparse depth measurements and corresponding RGB images. Existing single-scale propagation methods suffer from low computational efficiency and limited capacity for modeling scene context. To address these limitations, we propose LP-Net, the first framework introducing an inverse Laplacian pyramid paradigm for progressive depth prediction: it first captures global structural priors and then incrementally refines high-frequency details across pyramid levels. We design a multi-path feature pyramid module to enhance multi-scale contextual awareness and introduce a selective depth filtering module that enables dynamic, learnable smoothing or sharpening—balancing edge fidelity and noise robustness. LP-Net achieves state-of-the-art performance on KITTI (ranked #1 on the official benchmark), NYUv2, and TOFDC, delivering significant improvements in both accuracy and inference efficiency.

📝 Abstract
Depth completion endeavors to reconstruct a dense depth map from sparse depth measurements, leveraging the information provided by a corresponding color image. Existing approaches mostly hinge on single-scale propagation strategies that iteratively ameliorate initial coarse depth estimates through pixel-level message passing. Despite their commendable outcomes, these techniques are frequently hampered by computational inefficiencies and a limited grasp of scene context. To circumvent these challenges, we introduce LP-Net, an innovative framework that implements a multi-scale, progressive prediction paradigm based on Laplacian Pyramid decomposition. Diverging from propagation-based approaches, LP-Net initiates with a rudimentary, low-resolution depth prediction to encapsulate the global scene context, subsequently refining this through successive upsampling and the reinstatement of high-frequency details at incremental scales. We have developed two novel modules to bolster this strategy: 1) the Multi-path Feature Pyramid module, which segregates feature maps into discrete pathways, employing multi-scale transformations to amalgamate comprehensive spatial information, and 2) the Selective Depth Filtering module, which dynamically learns to apply both smoothness and sharpness filters to judiciously mitigate noise while accentuating intricate details. By integrating these advancements, LP-Net not only secures state-of-the-art (SOTA) performance across both outdoor and indoor benchmarks such as KITTI, NYUv2, and TOFDC, but also demonstrates superior computational efficiency. At the time of submission, LP-Net ranks 1st among all peer-reviewed methods on the official KITTI leaderboard.
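The abstract's coarse-to-fine paradigm — predict a low-resolution depth map for global structure, then repeatedly upsample and add back high-frequency residuals — follows the classic Laplacian pyramid reconstruction. As a minimal illustrative sketch (not LP-Net's learned modules; the averaging downsample, nearest-neighbor upsample, and function names here are simplifying assumptions), the decomposition and its inverse look like this:

```python
import numpy as np

def downsample(x):
    # 2x2 average pooling: a simple stand-in for Gaussian blur + decimation
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    # Nearest-neighbor 2x upsampling back to the finer grid
    return x.repeat(2, axis=0).repeat(2, axis=1)

def build_laplacian_pyramid(depth, levels):
    """Decompose a depth map into a coarse base plus per-scale residuals."""
    pyramid = []
    current = depth
    for _ in range(levels):
        coarse = downsample(current)
        # Residual holds the high-frequency detail lost at this scale
        pyramid.append(current - upsample(coarse))
        current = coarse
    pyramid.append(current)  # coarsest level: global scene structure
    return pyramid

def inverse_laplacian_reconstruct(pyramid):
    """Coarse-to-fine: start from the global estimate, reinstate details."""
    current = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        current = upsample(current) + residual
    return current
```

In LP-Net the coarsest level would be a predicted low-resolution depth map and each residual would be estimated by the network rather than computed from a known dense map; with known residuals, as above, reconstruction is exact.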
Problem

Research questions and friction points this paper is trying to address.

Reconstruct dense depth from sparse measurements
Improve computational efficiency in depth completion
Enhance scene context understanding in depth maps
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-scale Laplacian Pyramid decomposition
Multi-path Feature Pyramid module
Selective Depth Filtering module
Kun Wang
PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Zhiqiang Yan
National University of Singapore
3D computer vision, depth perception, occupancy prediction
Junkai Fan
Nanjing University of Science and Technology
image/video restoration, depth estimation
Jun Li
PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
Jian Yang
PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China