Degradation-Modeled Multipath Diffusion for Tunable Metalens Photography

📅 2025-06-28
🤖 AI Summary
Metalens imaging suffers from severe, spatially non-uniform optical degradation, and existing restoration methods rely heavily on precise calibration or large-scale paired training data while remaining prone to hallucinated artifacts. To address these challenges, we propose an unpaired, tunable multipath diffusion framework. The method uses three prompt paths (positive, neutral, and negative) to balance high-frequency detail generation, structural fidelity, and suppression of metalens-specific degradation. A spatially varying degradation-aware (SVDA) attention module adaptively models the complex, non-uniform optical and sensor-induced degradation of the MetaCamera system. A tunable decoder provides an explicit trade-off between reconstruction fidelity and perceptual quality. By combining physics-based priors with natural image priors from pretrained diffusion models, together with pseudo data augmentation, the framework is validated on a real millimeter-scale MetaCamera, where it improves image sharpness and fidelity, suppresses hallucinations, and outperforms existing supervised and unsupervised approaches.

📝 Abstract
Metalenses offer significant potential for ultra-compact computational imaging but face challenges from complex optical degradation and computational restoration difficulties. Existing methods typically rely on precise optical calibration or massive paired datasets, which are non-trivial for real-world imaging systems. Furthermore, a lack of control over the inference process often results in undesirable hallucinated artifacts. We introduce Degradation-Modeled Multipath Diffusion for tunable metalens photography, leveraging powerful natural image priors from pretrained models instead of large datasets. Our framework uses positive, neutral, and negative-prompt paths to balance high-frequency detail generation, structural fidelity, and suppression of metalens-specific degradation, alongside pseudo data augmentation. A tunable decoder enables controlled trade-offs between fidelity and perceptual quality. Additionally, a spatially varying degradation-aware attention (SVDA) module adaptively models complex optical and sensor-induced degradation. Finally, we design and build a millimeter-scale MetaCamera for real-world validation. Extensive results show that our approach outperforms state-of-the-art methods, achieving high-fidelity and sharp image reconstruction. More materials: https://dmdiff.github.io/.
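
The abstract does not state how the positive, neutral, and negative-prompt paths are combined. As a rough illustration only, the sketch below assumes a classifier-free-guidance-style rule: the neutral prediction is the anchor, it is pushed toward the positive prediction and away from the negative one. The function name `combine_paths` and the weights `w_pos` and `w_neg` are hypothetical and are not taken from the paper.

```python
from typing import List


def combine_paths(
    pos: List[float],
    neu: List[float],
    neg: List[float],
    w_pos: float = 1.5,
    w_neg: float = 1.0,
) -> List[float]:
    """Combine three per-element predictions into one (assumed rule).

    pos, neu, neg: flattened predictions from the positive, neutral,
    and negative-prompt paths (hypothetical inputs).
    w_pos, w_neg: guidance weights (hypothetical names and defaults).
    """
    if not (len(pos) == len(neu) == len(neg)):
        raise ValueError("all three paths must have the same length")
    # Start from the neutral path, move toward the positive path,
    # and move away from the negative (degradation) path.
    return [
        n + w_pos * (p - n) - w_neg * (g - n)
        for p, n, g in zip(pos, neu, neg)
    ]


if __name__ == "__main__":
    print(combine_paths([2.0], [1.0], [0.0], w_pos=1.0, w_neg=1.0))
```

With both weights set to zero the rule falls back to the neutral path, which is one way to read the neutral prompt as the structural-fidelity anchor; the actual mechanism in the paper may differ.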
Problem

Research questions and friction points this paper is trying to address.

Overcoming metalens optical degradation and restoration challenges
Reducing reliance on precise calibration or large datasets
Controlling inference to avoid hallucinated artifacts
Innovation

Methods, ideas, or system contributions that make the work stand out.

Degradation-Modeled Multipath Diffusion balances detail and fidelity
Tunable decoder controls fidelity and perceptual quality
SVDA module adaptively models optical degradation
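
The summary says only that the tunable decoder trades fidelity against perceptual quality; it does not describe the mechanism. One simple way to picture such a control, assumed here purely for illustration, is a single scalar that linearly blends a fidelity-oriented output with a perception-oriented output. The names `blend_outputs` and `alpha` are hypothetical.

```python
from typing import List


def blend_outputs(
    fidelity_out: List[float],
    perceptual_out: List[float],
    alpha: float,
) -> List[float]:
    """Linearly blend two flattened outputs (assumed control rule).

    alpha = 0.0 returns the fidelity-oriented output unchanged,
    alpha = 1.0 returns the perception-oriented output unchanged.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(fidelity_out) != len(perceptual_out):
        raise ValueError("outputs must have the same length")
    return [
        (1.0 - alpha) * f + alpha * p
        for f, p in zip(fidelity_out, perceptual_out)
    ]


if __name__ == "__main__":
    # Halfway between the two hypothetical outputs.
    print(blend_outputs([0.0, 1.0], [1.0, 0.0], alpha=0.5))
```

A single scalar keeps the trade-off easy to expose at inference time, but a learned decoder could equally apply the control inside its feature layers rather than on the final pixels.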