AI Summary
Fixed-focus, large-aperture cameras (e.g., smart glasses) suffer from depth-dependent defocus blur, caused by their shallow depth of field, as well as spatially varying optical aberrations; existing deblurring models trained on open-source datasets exhibit significant domain shift and generalize poorly to such real-world imaging conditions.
Method: We propose a physics-driven synthetic image generation framework that, for the first time, jointly models depth-dependent defocus blur and spatially varying aberrations, enabling simulation-to-real domain alignment without fine-tuning on real data. A lightweight low-resolution synthetic dataset is used to train an end-to-end deep network capable of restoring high-resolution (12 MP) real images.
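To make the spatially varying aberration component concrete, the sketch below blurs an image patch-wise, with a PSF that widens toward the image periphery. This is a simplified illustrative stand-in, not the paper's actual aberration model: the Gaussian PSF, the 3×3 patch grid, and the `sigma_center`/`sigma_edge` parameters are all assumptions made for this example.

```python
import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(sigma, size=9):
    """Normalized isotropic Gaussian kernel (toy stand-in for an aberrated PSF)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def spatially_varying_blur(img, grid=3, sigma_center=0.6, sigma_edge=2.0):
    """Patch-wise convolution: PSF width grows with distance from the optical
    axis, a crude model of field-dependent aberrations (assumed, illustrative)."""
    h, w = img.shape
    out = np.zeros_like(img, dtype=float)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    max_r = np.hypot(cy, cx)
    for i in range(grid):
        for j in range(grid):
            y0, y1 = i * h // grid, (i + 1) * h // grid
            x0, x1 = j * w // grid, (j + 1) * w // grid
            # Field height of the patch center, normalized to [0, 1].
            r = np.hypot((y0 + y1) / 2.0 - cy, (x0 + x1) / 2.0 - cx) / max_r
            sigma = sigma_center + r * (sigma_edge - sigma_center)
            # Blur the full image with this patch's PSF, keep only the patch.
            blurred = fftconvolve(img, gaussian_psf(sigma), mode="same")
            out[y0:y1, x0:x1] = blurred[y0:y1, x0:x1]
    return out
```

A real pipeline would interpolate PSFs between patches (or use measured/simulated lens PSFs) to avoid seams at patch boundaries; the hard patch assignment here is kept only for brevity.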
Contribution/Results: Our method achieves state-of-the-art performance across diverse challenging scenarios, demonstrating superior efficiency, scalability, and practical deployability compared to prior approaches.
Abstract
Modern cameras with large apertures often suffer from a shallow depth of field, resulting in blurry images of objects outside the focal plane. This limitation is particularly problematic for fixed-focus cameras, such as those used in smart glasses, where adding autofocus mechanisms is challenging due to form factor and power constraints. Because the optical aberrations and defocus properties of each camera system are unique, deep learning models trained on existing open-source datasets often face domain gaps and do not perform well in real-world settings. In this paper, we propose an efficient and scalable dataset synthesis approach that does not rely on fine-tuning with real-world data. Our method simultaneously models depth-dependent defocus and spatially varying optical aberrations, addressing both computational complexity and the scarcity of high-quality RGB-D datasets. Experimental results demonstrate that a network trained on our low-resolution synthetic images generalizes effectively to high-resolution (12 MP) real-world images across diverse scenes.
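The depth-dependent defocus described above can be sketched with the standard thin-lens circle-of-confusion formula: given an RGB-D pair, depth is quantized into layers and each layer is convolved with a disk PSF whose radius follows its circle of confusion. This is a minimal illustration of the general idea, not the paper's synthesis pipeline; the function names, default lens parameters (4 mm focal length, f/2, 1.4 µm pixels), and the naive layer compositing are all assumptions.

```python
import numpy as np
from scipy.signal import fftconvolve

def coc_radius_px(depth_m, focus_m, focal_mm, f_number, pixel_um):
    """Thin-lens circle-of-confusion radius in pixels for a point at depth_m,
    with the lens focused at focus_m: c = A * f * |d - s| / (d * (s - f))."""
    f = focal_mm * 1e-3
    aperture = f / f_number
    coc_m = aperture * f * abs(depth_m - focus_m) / (depth_m * (focus_m - f))
    return 0.5 * coc_m / (pixel_um * 1e-6)

def disk_kernel(radius_px):
    """Normalized disk PSF approximating a defocused circular aperture."""
    r = max(1, int(np.ceil(radius_px)))
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    k = ((x**2 + y**2) <= max(radius_px, 0.5) ** 2).astype(float)
    return k / k.sum()

def render_defocus(img, depth, focus_m, focal_mm=4.0, f_number=2.0,
                   pixel_um=1.4, n_layers=4):
    """Naive layered defocus: quantize the depth map, blur each layer with the
    disk PSF of its mid-depth. Ignores occlusion blending for brevity."""
    edges = np.linspace(depth.min(), depth.max() + 1e-6, n_layers + 1)
    out = np.zeros_like(img, dtype=float)
    for i in range(n_layers):
        mask = (depth >= edges[i]) & (depth < edges[i + 1])
        if not mask.any():
            continue
        mid = 0.5 * (edges[i] + edges[i + 1])
        k = disk_kernel(coc_radius_px(mid, focus_m, focal_mm,
                                      f_number, pixel_um))
        out[mask] = fftconvolve(img, k, mode="same")[mask]
    return out
```

For example, with a 4 mm f/2 lens focused at 1 m, a point at 2 m has a circle of confusion of roughly 4 µm, i.e. about a 1.4-pixel radius at 1.4 µm pitch. A production-quality synthesizer would additionally handle layer occlusion boundaries and combine this defocus with the field-dependent aberration PSFs.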