Efficient Multi-disparity Transformer for Light Field Image Super-resolution

📅 2024-07-22
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing light field (LF) super-resolution methods model all sub-aperture images (SAIs) uniformly, leading to parallax entanglement and computational redundancy. To address this, we propose the Multi-scale Disparity Transformer (MDT), the first disparity-aware "divide-and-conquer" Transformer architecture for LF SR. MDT explicitly disentangles parallax through a multi-branch structure in which each branch applies an independent disparity self-attention (DSA) mechanism to a distinct disparity range. Integrated with SAI-wise collaborative modeling and multi-scale feature fusion, it forms the lightweight LF-MDTNet. On 2× and 4× LF SR tasks, LF-MDTNet achieves PSNR gains of +0.37 dB and +0.41 dB over state-of-the-art methods, respectively, while reducing model parameters by 23% and accelerating inference by 1.8×. The approach thus advances accuracy, efficiency, and interpretability, enabling explicit parallax-aware representation learning in LF super-resolution.

📝 Abstract
This paper presents the Multi-scale Disparity Transformer (MDT), a novel Transformer tailored for light field image super-resolution (LFSR) that addresses the issues of computational redundancy and disparity entanglement caused by the indiscriminate processing of sub-aperture images inherent in conventional methods. MDT features a multi-branch structure, with each branch utilising independent disparity self-attention (DSA) to target specific disparity ranges, effectively reducing computational complexity and disentangling disparities. Building on this architecture, we present LF-MDTNet, an efficient LFSR network. Experimental results demonstrate that LF-MDTNet outperforms existing state-of-the-art methods by 0.37 dB and 0.41 dB PSNR at the 2x and 4x scales, achieving superior performance with fewer parameters and higher speed.
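To make the multi-branch idea concrete, here is a minimal numpy sketch of branch-wise self-attention over SAI token subsets. It is an illustration of the general "divide-and-conquer" pattern the abstract describes, not the paper's actual DSA implementation: the function names, the use of plain slices to stand in for disparity ranges, and the additive fusion are all assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def branch_attention(tokens, wq, wk, wv):
    """Plain scaled dot-product self-attention over one token subset."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scale = np.sqrt(q.shape[-1])
    return softmax(q @ k.T / scale) @ v

def multi_branch_disparity_attention(sai_tokens, branch_slices, branch_weights):
    """Each branch attends only within its own subset of SAI tokens
    (a stand-in for one disparity range), keeping the per-branch
    attention matrix small; branch outputs are fused by summation."""
    out = np.zeros_like(sai_tokens)
    for sl, (wq, wk, wv) in zip(branch_slices, branch_weights):
        out[sl] += branch_attention(sai_tokens[sl], wq, wk, wv)
    return out
```

Because each branch only forms an attention matrix over its own subset, attention cost drops from quadratic in the full token count to quadratic per subset, which is the efficiency argument the abstract makes for disentangling disparity ranges.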
Problem

Research questions and friction points this paper is trying to address.

Addresses data redundancy in light field images
Resolves disparity entanglement in image processing
Improves efficiency in light field super-resolution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-branch Transformer with independent disparity self-attention (DSA) per branch
Each branch targets a specific disparity range, disentangling disparities
Lightweight LF-MDTNet network built on MDT for efficient super-resolution
Zeke Zexi Hu
University of Sydney
Computer Vision · Deep Learning · Machine Learning
Haodong Chen
The School of Computer Science, University of Sydney, Darlington, NSW, Australia
Yuk Ying Chung
The School of Computer Science, University of Sydney, Darlington, NSW, Australia
Xiaoming Chen
The School of Computer and Artificial Intelligence, Beijing Technology and Business University, Beijing, China