Seeing and Seeing Through the Glass: Real and Synthetic Data for Multi-Layer Depth Estimation

📅 2025-03-14
📈 Citations: 0
Influential: 0
🤖 AI Summary
Estimating multi-layer depth for transparent objects -- perceiving both the transparent surface and the objects occluded behind it -- is critical for real-world applications such as robotic manipulation, yet state-of-the-art depth estimation methods perform poorly on such scenes. To address this, the paper introduces LayeredDepth, the first dataset with multi-layer depth annotations, comprising a real-world benchmark of 1,500 images and a fully procedural synthetic data generator used to render 15,300 images with pixel-accurate ground-truth multi-layer depth. Fine-tuning state-of-the-art single-layer depth models on the synthetic data raises quadruplet accuracy on the real-world benchmark from 55.14% to 75.20%, a 20.06-percentage-point improvement. Notably, baseline models trained solely on synthetic data already generalize well across domains for multi-layer depth estimation. This work establishes a new benchmark and an effective methodology for depth perception in transparent scenes.

📝 Abstract
Transparent objects are common in daily life, and understanding their multi-layer depth information -- perceiving both the transparent surface and the objects behind it -- is crucial for real-world applications that interact with transparent materials. In this paper, we introduce LayeredDepth, the first dataset with multi-layer depth annotations, including a real-world benchmark and a synthetic data generator, to support the task of multi-layer depth estimation. Our real-world benchmark consists of 1,500 images from diverse scenes, and evaluating state-of-the-art depth estimation methods on it reveals that they struggle with transparent objects. The synthetic data generator is fully procedural and capable of providing training data for this task with an unlimited variety of objects and scene compositions. Using this generator, we create a synthetic dataset with 15,300 images. Baseline models trained solely on this synthetic dataset produce good cross-domain multi-layer depth estimation. Fine-tuning state-of-the-art single-layer depth models on it substantially improves their performance on transparent objects, with quadruplet accuracy on our benchmark increased from 55.14% to 75.20%. All images and validation annotations are available under CC0 at https://layereddepth.cs.princeton.edu.
Problem

Research questions and friction points this paper is trying to address.

Estimating multi-layer depth for transparent objects
Creating a dataset with real and synthetic depth annotations
Improving depth estimation models for transparent materials
Innovation

Methods, ideas, or system contributions that make the work stand out.

Introduces LayeredDepth dataset for multi-layer depth estimation
Includes real-world benchmark and synthetic data generator
Synthetic data improves depth estimation on transparent objects
Hongyu Wen
PhD student, Princeton University
Yiming Zuo
Princeton University
Venkat Subramanian
Department of Computer Science, Princeton University
Patrick Chen
Department of Computer Science, Princeton University
Jia Deng
Department of Computer Science, Princeton University