🤖 AI Summary
Transparent objects impede accurate depth sensing due to light refraction and reflection, while existing supervised methods rely on costly and labor-intensive ground-truth depth annotations. To address this, we propose a fully self-supervised depth completion framework that requires no labeled data. Our method leverages reliable depth measurements from non-transparent regions, synthetically simulates transparent-object depth occlusions within those non-transparent regions in a controllable manner, and employs the original (incomplete) depth map as a self-supervised reconstruction signal. We jointly optimize a depth completion network with a reconstruction loss computed over the synthetically corrupted non-transparent regions. Experiments demonstrate that our approach achieves performance comparable to fully supervised methods; moreover, pre-training with our method significantly improves generalization in few-shot scenarios. This work establishes an efficient, low-cost paradigm for depth perception of transparent objects.
📝 Abstract
The perception of transparent objects is a well-known challenge in computer vision. Conventional depth sensors struggle to measure the depth of transparent objects because of light refraction and reflection. Previous research has typically trained a neural network to complete the depth acquired by the sensor, which enables fast and accurate recovery of depth maps for transparent objects. However, such training relies on large amounts of annotated data for supervision, and labeling depth maps is costly. To tackle this challenge, we propose a new self-supervised method for training depth completion networks. Our method simulates the depth deficits of transparent objects within non-transparent regions and uses the original depth map as ground truth for supervision. Experiments demonstrate that our method achieves performance comparable to supervised approaches, and pre-training with our method improves model performance when training samples are scarce.
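The core self-supervision idea described above can be sketched in a few lines: synthetically punch depth holes into reliable (non-transparent) regions, then supervise the completion network with the original depth values at those pixels. The sketch below is a minimal illustration under assumed names (`make_self_supervised_pair`, `masked_l1_loss`, a random-pixel hole model); the paper's actual occlusion simulation and loss may differ.

```python
import numpy as np

def make_self_supervised_pair(depth, nontransparent_mask, hole_frac=0.3, seed=0):
    """Simulate transparent-object depth deficits inside non-transparent regions.

    depth: (H, W) sensor depth map; 0 marks missing values.
    nontransparent_mask: boolean (H, W), True where the sensor depth is reliable.
    Returns (corrupted_depth, target_depth, synthetic_hole_mask).
    """
    rng = np.random.default_rng(seed)
    corrupted = depth.copy()
    # Randomly zero out a fraction of reliable pixels to mimic the depth
    # holes a sensor produces on transparent surfaces (a simple stand-in
    # for the paper's controllable occlusion simulation).
    candidates = np.flatnonzero(nontransparent_mask)
    n_holes = int(hole_frac * candidates.size)
    holes = rng.choice(candidates, size=n_holes, replace=False)
    hole_mask = np.zeros(depth.shape, dtype=bool)
    hole_mask.flat[holes] = True
    corrupted[hole_mask] = 0.0
    # The original depth serves as the reconstruction target.
    return corrupted, depth, hole_mask

def masked_l1_loss(pred, target, hole_mask):
    # Supervise only at the synthetically removed pixels, where the
    # original sensor depth is known to be valid.
    return np.abs(pred - target)[hole_mask].mean()
```

A training step would feed `corrupted` to the completion network and minimize `masked_l1_loss(network(corrupted), target, hole_mask)`; no human depth annotation is involved.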