🤖 AI Summary
Geometric distortions cause pixel-level misalignment between reference and distorted images, rendering conventional full-reference image quality assessment (FR-IQA) methods ineffective. To address this, we propose DeepSSIM, a training-free, task-agnostic IQA framework that leverages pre-trained CNNs to extract multi-layer deep features. Structural similarity is computed via sliding windows over these features, while a channel-wise attention calibration strategy mitigates attention deviation, enabling robust quality estimation under both aligned and non-aligned conditions. Our key contributions are: (i) the first unified framework for Geometrically-Disparate-Reference IQA (GDR-IQA) that eliminates reliance on explicit registration, geometric modeling, or task-specific architectures; and (ii) a novel structural consistency metric formulated directly in deep feature space. Experiments demonstrate state-of-the-art performance on standard Aligned-Reference IQA (AR-IQA) benchmarks and strong robustness across diverse geometric distortions, including scaling, rotation, perspective, and elastic deformation. Moreover, DeepSSIM serves effectively as a differentiable perceptual loss for training image super-resolution, enhancement, and restoration models.
📝 Abstract
Image Quality Assessment (IQA) with references plays an important role in optimizing and evaluating computer vision tasks. Traditional methods assume that all pixels of the reference and test images are fully aligned. Such Aligned-Reference IQA (AR-IQA) approaches fail to address many real-world problems in which various geometric deformations exist between the two images. Although significant effort has been made to attack the Geometrically-Disparate-Reference IQA (GDR-IQA) problem, it has been addressed in a task-dependent fashion, for example, by dedicated designs for image super-resolution and retargeting, or by assuming that the geometric distortions are small enough to be countered by translation-robust filters or explicit image registration. Here we rethink this problem and propose a unified, non-training-based Deep Structural Similarity (DeepSSIM) approach that addresses the above problems in a single framework: it assesses the structural similarity of deep features in a simple but efficient way and uses an attention calibration strategy to alleviate attention deviation. The proposed method, without any application-specific design, achieves state-of-the-art performance on AR-IQA datasets while showing strong robustness in various GDR-IQA test cases. Interestingly, our tests also show the effectiveness of DeepSSIM as an optimization tool for training image super-resolution, enhancement and restoration, implying an even wider generalizability. (Source code will be made public after the review is completed.)
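To make the core idea concrete, here is a minimal NumPy sketch of SSIM-style structural comparison computed with sliding windows over deep feature maps. The window size, stability constants, and omission of the attention calibration step are illustrative assumptions, not the paper's exact formulation; in the full method the inputs would be multi-layer activations from a pre-trained CNN.

```python
import numpy as np

def feature_ssim(f_ref, f_dst, win=3, c1=1e-4, c2=9e-4):
    """SSIM-style similarity between two feature maps of shape (C, H, W).

    Hypothetical sketch: `win`, `c1`, `c2`, and the averaging scheme are
    assumptions; DeepSSIM additionally applies attention calibration.
    """
    assert f_ref.shape == f_dst.shape
    C, H, W = f_ref.shape
    scores = []
    # Slide a non-overlapping window over each feature channel.
    for c in range(C):
        for i in range(0, H - win + 1, win):
            for j in range(0, W - win + 1, win):
                x = f_ref[c, i:i + win, j:j + win].ravel()
                y = f_dst[c, i:i + win, j:j + win].ravel()
                mx, my = x.mean(), y.mean()
                vx, vy = x.var(), y.var()
                cov = ((x - mx) * (y - my)).mean()
                # Standard SSIM formula, applied to feature statistics.
                s = ((2 * mx * my + c1) * (2 * cov + c2)) / \
                    ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
                scores.append(s)
    return float(np.mean(scores))
```

Identical feature maps score exactly 1 under this formula, and because the statistics are pooled within local windows of the feature maps (rather than compared pixel-by-pixel), the score degrades gracefully under small spatial misalignment.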