🤖 AI Summary
Super-resolution methods for equirectangular projection (ERP) omnidirectional images tend to ignore the geometric distortion that ERP introduces, so they draw on a limited range of pixels and miss self-similar textures. This paper proposes a distortion-guided Transformer, GDGT-OSR, to address both problems. It introduces: (1) a distortion-modulated rectangle-window self-attention mechanism, integrated with deformable self-attention, for spatially adaptive feature modeling; (2) a distortion guidance generator that exploits how distortion varies across latitudes to produce the modulation signal; and (3) a dynamic feature aggregation scheme that adaptively fuses the features from the different self-attention modules. In experiments on public datasets, the method outperforms existing approaches.
📝 Abstract
As virtual and augmented reality applications gain popularity, omnidirectional image (ODI) super-resolution has become increasingly important. Unlike ordinary 2D images, which are formed on a plane, ODIs are projected onto spherical surfaces. Applying established image super-resolution methods to ODIs therefore requires an equirectangular projection (ERP) that maps the ODIs onto a plane, and ODI super-resolution needs to take into account the geometric distortion that ERP introduces. However, previous deep-learning-based methods do not consider this distortion; they utilize only a limited range of pixels and may easily miss self-similar textures for reconstruction. In this paper, we introduce a novel Geometric Distortion Guided Transformer for Omnidirectional image Super-Resolution (GDGT-OSR). Specifically, a distortion-modulated rectangle-window self-attention mechanism, integrated with deformable self-attention, is proposed to better perceive the distortion and thus involve more self-similar textures. Distortion modulation is achieved through a newly devised distortion guidance generator that produces guidance by exploiting the variability of distortion across latitudes. Furthermore, we propose a dynamic feature aggregation scheme to adaptively fuse the features from different self-attention modules. We present extensive experimental results on public datasets and show that GDGT-OSR outperforms existing methods.
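To make the "variability of distortion across latitudes" concrete, the following is a minimal NumPy sketch of the horizontal stretch that ERP itself introduces: a row at latitude φ is stretched by 1/cos(φ) relative to the sphere, so rows near the poles are heavily stretched while rows near the equator are almost undistorted. This is not the paper's distortion guidance generator, whose design the abstract does not describe; the function names `erp_latitudes` and `erp_stretch_map` are hypothetical and used only for illustration.

```python
import numpy as np


def erp_latitudes(height):
    """Latitude in radians at the centre of each ERP row (top row near +pi/2)."""
    rows = np.arange(height)
    return (0.5 - (rows + 0.5) / height) * np.pi


def erp_stretch_map(height, width):
    """Per-pixel horizontal stretch factor 1/cos(latitude) of an ERP image.

    Illustrative only: a fixed geometric map, not a learned guidance signal.
    """
    lat = erp_latitudes(height)
    stretch = 1.0 / np.cos(lat)  # row centres never sit exactly on a pole
    return np.repeat(stretch[:, None], width, axis=1)


# Small example: 8 rows by 16 columns.
stretch = erp_stretch_map(8, 16)
print(stretch[:, 0])  # largest in the first and last rows, close to 1 in the middle
```

A map like this is constant along each row and symmetric about the equator, which is one reason a horizontally elongated attention window is a plausible fit for ERP images: pixels in the same row share the same distortion.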