🤖 AI Summary
This work addresses the severe quadratic depth distortion that arises when existing hologram super-resolution (HSR) methods, designed mainly for angle-of-view expansion, are adapted to volumetric up-sampling, which degrades 3D focal accuracy. The authors propose CV-HoloSR, a framework that combines a Complex-Valued Residual Dense Network (CV-RDN) with a depth-aware perceptual reconstruction loss to preserve physically consistent linear depth scaling during volume up-sampling while recovering sharp, high-frequency interference patterns. The study also introduces a large-depth-range hologram dataset with resolutions up to 4K, along with a complex-valued LoRA fine-tuning strategy that adapts the pre-trained backbone to new depth ranges and display configurations using only 200 samples. Experiments show a 32% improvement in LPIPS (reaching 0.2001) over state-of-the-art baselines, a reduction in training time of over 75% (from 22.5 to 5.2 hours), and successful adaptation to unseen depth ranges and novel display configurations.
📝 Abstract
Existing hologram super-resolution (HSR) methods primarily focus on angle-of-view expansion. Adapting them for volumetric spatial up-sampling introduces severe quadratic depth distortion, degrading 3D focal accuracy. We propose CV-HoloSR, a complex-valued HSR framework specifically designed to preserve physically consistent linear depth scaling during volume up-sampling. Built upon a Complex-Valued Residual Dense Network (CV-RDN) and optimized with a novel depth-aware perceptual reconstruction loss, our model effectively suppresses over-smoothing to recover sharp, high-frequency interference patterns. To support this, we introduce a comprehensive large-depth-range dataset with resolutions up to 4K. Furthermore, to overcome the inherent depth bias of pre-trained encoders when scaling to massive target volumes, we integrate a parameter-efficient fine-tuning strategy utilizing complex-valued Low-Rank Adaptation (LoRA). Extensive numerical and physical optical experiments demonstrate our method's superiority. CV-HoloSR achieves a 32% improvement in perceptual realism (LPIPS of 0.2001) over state-of-the-art baselines. Additionally, our tailored LoRA strategy requires merely 200 samples, reducing training time by over 75% (from 22.5 to 5.2 hours) while successfully adapting the pre-trained backbone to unseen depth ranges and novel display configurations.
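The abstract does not spell out how the complex-valued LoRA is formulated, so the following is only a minimal sketch of the general idea, assuming the standard LoRA update carries over directly to complex weights: a frozen weight W is augmented with a trainable low-rank term (alpha / r) · B · A, where A and B are complex matrices. The class name `ComplexLoRALinear`, the layer size, the rank, and the initialization are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


class ComplexLoRALinear:
    """Frozen complex linear map W plus a low-rank update (alpha / r) * B @ A.

    Illustrative only: this mirrors standard LoRA with complex-valued
    factors and is an assumption, not the paper's implementation.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = np.asarray(weight, dtype=np.complex128)  # kept frozen
        # A gets a small random complex init; B starts at zero so the
        # adapted layer initially reproduces the pre-trained layer exactly.
        real = rng.standard_normal((rank, d_in))
        imag = rng.standard_normal((rank, d_in))
        self.A = (real + 1j * imag) / np.sqrt(2 * d_in)
        self.B = np.zeros((d_out, rank), dtype=np.complex128)
        self.scale = alpha / rank

    def delta_weight(self):
        # Low-rank correction; its rank is at most `rank`.
        return self.scale * (self.B @ self.A)

    def __call__(self, x):
        # x has shape (batch, d_in); output has shape (batch, d_out).
        return x @ (self.weight + self.delta_weight()).T

    def num_trainable(self):
        # Only A and B would be updated during fine-tuning.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.standard_normal((64, 64)) + 1j * rng.standard_normal((64, 64))
x = rng.standard_normal((8, 64)) + 1j * rng.standard_normal((8, 64))

layer = ComplexLoRALinear(W, rank=4)
y_before = layer(x)  # identical to x @ W.T because B is still zero

# Stand-in for a fine-tuning step: give B some non-zero complex values.
layer.B = 0.01 * (rng.standard_normal((64, 4)) + 1j * rng.standard_normal((64, 4)))
y_after = layer(x)

print(layer.num_trainable(), W.size)  # 512 trainable vs 4096 frozen complex entries
print(np.linalg.matrix_rank(layer.delta_weight()))  # at most 4
```

Under these assumptions, only the two small factors are trained while the backbone stays fixed, which is consistent with the abstract's description of a parameter-efficient strategy that works with few samples. How the actual method distributes adapters across the CV-RDN layers is not stated in the abstract.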