🤖 AI Summary
This work addresses unsupervised depth estimation from a single coded-aperture defocused image, without handcrafted priors or paired training data. We propose a physics-informed diffusion regularization framework that couples a differentiable coded-aperture forward model with a pre-trained diffusion model prior, enabling iterative posterior-sampling optimization in the denoising latent space. The method does not depend on a specific camera configuration or on ground-truth annotations, supporting cross-device generalization. Evaluated on both synthetic and real-world data, it significantly outperforms U-Net-based baselines and conventional approaches, particularly under varying noise levels, demonstrating superior robustness and reconstruction accuracy. The framework establishes a new paradigm for unsupervised, physically interpretable, and generalizable depth estimation in computational imaging.
📝 Abstract
We propose a single-snapshot depth-from-defocus (DFD) reconstruction method for coded-aperture imaging that replaces hand-crafted priors with a learned diffusion prior used purely as regularization. Our optimization framework enforces measurement consistency via a differentiable forward model while guiding solutions with the diffusion prior in the denoised image domain, yielding higher accuracy and stability than classical optimization. Unlike U-Net-style regressors, our approach requires no paired defocus-RGBD training data and does not tie training to a specific camera configuration. Experiments on comprehensive simulations and a prototype camera demonstrate consistently strong RGBD reconstructions across noise levels, outperforming both U-Net baselines and a classical coded-aperture DFD method.
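The optimization described above, alternating between a measurement-consistency gradient from the differentiable forward model and a regularization gradient derived from a denoising prior, can be sketched in a minimal form. This is only an illustrative toy, not the paper's implementation: `forward_model` is a stand-in for the coded-aperture image-formation model, and `denoiser` is a placeholder for the pre-trained diffusion denoiser, here written in the regularization-by-denoising (RED) style.

```python
import numpy as np

# Hypothetical stand-in for the differentiable coded-aperture forward model.
# Here it is a trivial elementwise model so the sketch stays self-contained.
def forward_model(depth, psf_bank):
    return depth * psf_bank

# Placeholder for a pre-trained diffusion denoiser: mild shrinkage toward
# the image mean, standing in for a learned prior.
def denoiser(x):
    return 0.9 * x + 0.1 * x.mean()

def reconstruct(measurement, psf_bank, steps=200, lr=0.1, lam=0.05):
    """RED-style loop: data-consistency gradient plus denoiser-based
    regularization gradient, applied iteratively from a zero init."""
    x = np.zeros_like(measurement)
    for _ in range(steps):
        # Data term: gradient of 0.5 * ||A(x) - y||^2 through the toy model.
        residual = forward_model(x, psf_bank) - measurement
        grad_data = psf_bank * residual  # adjoint of the elementwise model
        # Prior term: RED gradient, x minus its denoised version.
        grad_prior = x - denoiser(x)
        x = x - lr * (grad_data + lam * grad_prior)
    return x
```

In the full method, the denoising step would run in the diffusion model's latent space at a matched noise level, and the forward model would encode the depth-dependent coded-aperture point-spread functions rather than an elementwise map.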