🤖 AI Summary
Existing full-reference image quality assessment (FR-IQA) metrics show poor perceptual consistency and weak distortion modeling when applied to super-resolution (SR) images. To address this, we propose the Perception-oriented Bidirectional Attention Network (PBAN). Drawing on characteristics of the human visual system, PBAN uses a bidirectional attention mechanism that builds visual attention to distortion in both directions, mirroring the generation and evaluation processes of SR images. It further incorporates Grouped Multi-scale Deformable Convolution to perceive distortion adaptively, and Sub-information Excitation Convolution to direct attention to sub-pixel and sub-channel information. The framework is formulated as an end-to-end quality regression model, eliminating the need for handcrafted distortion priors. Extensive experiments on SR-specific benchmarks, including LIVE-SR and MSSIM-SR, demonstrate that PBAN significantly outperforms state-of-the-art methods, achieving an average 23.6% improvement in Pearson Linear Correlation Coefficient (PLCC). These results support its effectiveness and generalizability for SR quality prediction.
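For readers unfamiliar with the PLCC figure quoted above, the following is a minimal sketch of how the Pearson Linear Correlation Coefficient between predicted quality scores and subjective scores can be computed. It is not taken from the paper: the function name `plcc` and the sample scores are hypothetical, and the paper's exact evaluation protocol is not described here. IQA studies often fit a nonlinear mapping to the predictions before computing PLCC, a step this sketch omits.

```python
import math


def plcc(predicted, subjective):
    """Pearson linear correlation between two equal-length score lists."""
    if len(predicted) != len(subjective) or len(predicted) < 2:
        raise ValueError("need two score lists of equal length >= 2")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_s = sum(subjective) / n

    # Covariance and variance sums (the 1/n factors cancel in the ratio).
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(predicted, subjective))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_s = sum((s - mean_s) ** 2 for s in subjective)

    if var_p == 0 or var_s == 0:
        raise ValueError("PLCC is undefined when either score list is constant")

    return cov / math.sqrt(var_p * var_s)


# Hypothetical data: model predictions and mean opinion scores for four images.
predicted_scores = [0.2, 0.4, 0.6, 0.8]
mean_opinion_scores = [1.0, 2.0, 3.0, 4.0]
print(plcc(predicted_scores, mean_opinion_scores))
```

A value close to 1 means the predictions track the subjective scores linearly; a value close to 0 means there is little linear relationship.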
📝 Abstract
Many super-resolution (SR) algorithms have been proposed to increase image resolution. However, full-reference (FR) image quality assessment (IQA) metrics for comparing and evaluating different SR algorithms are limited. In this work, we propose the Perception-oriented Bidirectional Attention Network (PBAN) for image SR FR-IQA, which is composed of three modules: an image encoder module, a perception-oriented bidirectional attention (PBA) module, and a quality prediction module. First, we encode the input images into feature representations. Inspired by the characteristics of the human visual system, we then construct the PBA module. Specifically, unlike existing attention-based SR IQA methods, we design a Bidirectional Attention that builds visual attention to distortion in both directions, consistent with the generation and evaluation processes of SR images. To further guide the quality assessment towards the perception of distorted information, we propose Grouped Multi-scale Deformable Convolution, which enables the method to perceive distortion adaptively. Moreover, we design Sub-information Excitation Convolution to direct visual perception to both sub-pixel and sub-channel attention. Finally, the quality prediction module integrates quality-aware features and regresses quality scores. Extensive experiments demonstrate that the proposed PBAN outperforms state-of-the-art quality assessment methods.