Exploring Semantic Feature Discrimination for Perceptual Image Super-Resolution and Opinion-Unaware No-Reference Image Quality Assessment

📅 2025-03-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address insufficient semantic-texture modeling in perceptual super-resolution (SR) and the limited generalizability of no-reference image quality assessment (NR-IQA), this paper proposes the Semantic Feature Discrimination (SFD) framework. Methodologically, SFD introduces CLIP's multi-level semantic features into SR adversarial training for the first time, jointly employing pixel-aligned mid-level feature discrimination (Feat-D) and text-guided high-level abstract discrimination (TG-D). A learnable prompt pair (LPP) enables text-image semantic alignment for adversarial optimization. Furthermore, SFD derives SFD-IQA, a fully unsupervised, opinion-unaware NR-IQA metric, from the same discriminator without any additional training. Experiments demonstrate that SFD achieves competitive PSNR/SSIM on both classical and real-world SR benchmarks while significantly improving LPIPS. SFD-IQA attains an average 12.7% gain in Spearman rank-order correlation coefficient (SROCC) across multiple opinion-unaware NR-IQA benchmarks, outperforming existing unsupervised methods.
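The text-guided discrimination and the derived SFD-IQA score described above can be sketched as a softmax over cosine similarities between a CLIP-style image embedding and a learnable positive/negative prompt pair. The sketch below is illustrative only: the embeddings are random stand-ins, not CLIP outputs, and the function names and temperature value are assumptions, not the paper's code.

```python
import numpy as np

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def quality_score(img_emb, pos_prompt, neg_prompt, temperature=0.07):
    """CLIP-style scoring: softmax over cosine similarities to a learnable
    positive ("high quality") / negative ("low quality") prompt pair.
    Returns the probability assigned to the positive prompt."""
    img = l2_normalize(img_emb)
    prompts = l2_normalize(np.stack([pos_prompt, neg_prompt]))
    logits = prompts @ img / temperature      # (2,) cosine sims / temperature
    logits -= logits.max()                    # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return probs[0]

rng = np.random.default_rng(0)
dim = 512
pos = rng.normal(size=dim)  # stand-in for a learned "high-quality" prompt embedding
neg = rng.normal(size=dim)  # stand-in for a learned "low-quality" prompt embedding

hq_img = pos + 0.1 * rng.normal(size=dim)  # image embedding near the positive prompt
lq_img = neg + 0.1 * rng.normal(size=dim)  # image embedding near the negative prompt

print(quality_score(hq_img, pos, neg))  # near 1: rated high quality
print(quality_score(lq_img, pos, neg))  # near 0: rated low quality
```

In the paper's framework the prompt embeddings are trained adversarially (TG-D) and then reused at test time to produce an opinion-unaware quality score (SFD-IQA); this sketch only shows the scoring geometry.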

📝 Abstract
Generative Adversarial Networks (GANs) have been widely applied to image super-resolution (SR) to enhance perceptual quality. However, most existing GAN-based SR methods perform coarse-grained discrimination directly on images and ignore their semantic information, making it challenging for the super-resolution network (SRN) to learn fine-grained, semantics-related texture details. To alleviate this issue, we propose a semantic feature discrimination method, SFD, for perceptual SR. Specifically, we first design a feature discriminator (Feat-D) to discriminate the pixel-wise mid-level semantic features from CLIP, aligning the feature distributions of SR images with those of high-quality images. Additionally, we propose a text-guided discrimination method (TG-D) that introduces learnable prompt pairs (LPP) trained in an adversarial manner to perform discrimination on the more abstract output feature of CLIP, further enhancing the discriminative ability of our method. With both Feat-D and TG-D, SFD can effectively distinguish between the semantic feature distributions of low-quality and high-quality images, encouraging the SRN to generate more realistic and semantically relevant textures. Furthermore, based on the trained Feat-D and LPP, we propose a novel opinion-unaware no-reference image quality assessment (OU NR-IQA) method, SFD-IQA, which greatly improves OU NR-IQA performance without any additional targeted training. Extensive experiments on classical SISR, real-world SISR, and OU NR-IQA tasks demonstrate the effectiveness of our proposed methods.
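The pixel-wise feature discrimination idea (Feat-D) can be sketched as a per-position discriminator over a semantic feature map with hinge GAN losses. Everything below is a minimal stand-in, assuming mid-level features of shape (C, H, W); the 1×1 linear discriminator, the random feature tensors, and the hinge losses are illustrative choices, not the paper's architecture.

```python
import numpy as np

class PixelwiseFeatureDiscriminator:
    """Illustrative Feat-D stand-in: a 1x1 "convolution" (per-pixel linear
    map) that assigns a real/fake logit to every spatial position of a
    semantic feature map, rather than one logit per image."""
    def __init__(self, channels, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=channels ** -0.5, size=channels)
        self.b = 0.0

    def __call__(self, feat):  # feat: (C, H, W)
        return np.einsum("c,chw->hw", self.w, feat) + self.b  # (H, W) logits

def hinge_d_loss(real_logits, fake_logits):
    """Discriminator side: push real-pixel logits above +1, fake below -1."""
    return (np.maximum(0.0, 1.0 - real_logits).mean()
            + np.maximum(0.0, 1.0 + fake_logits).mean())

def hinge_g_loss(fake_logits):
    """Generator (SRN) side: push every pixel's logit upward."""
    return -fake_logits.mean()

rng = np.random.default_rng(1)
C, H, W = 64, 16, 16
hq_feat = rng.normal(size=(C, H, W))  # stand-in for CLIP features of an HQ image
sr_feat = rng.normal(size=(C, H, W))  # stand-in for CLIP features of an SR image

d = PixelwiseFeatureDiscriminator(C)
d_loss = hinge_d_loss(d(hq_feat), d(sr_feat))
g_loss = hinge_g_loss(d(sr_feat))
```

Discriminating per spatial position, rather than per image, is what lets the adversarial signal target fine-grained, semantics-aligned texture rather than global realism alone.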
Problem

Research questions and friction points this paper is trying to address.

Enhance perceptual image super-resolution with semantic feature discrimination
Improve fine-grained texture details using CLIP-based feature alignment
Develop opinion-unaware no-reference image quality assessment without targeted training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Semantic feature discrimination for image SR
Text-guided discrimination with learnable prompts
Opinion-unaware NR-IQA using trained features
Guanglu Dong
Sichuan University
Super-Resolution, Image Restoration, Low-Level Computer Vision
Xiangyu Liao
College of Electronics and Information Engineering, Sichuan University, China
Mingyang Li
College of Electronics and Information Engineering, Sichuan University, China
Guihuan Guo
College of Electronics and Information Engineering, Sichuan University, China
Chao Ren
College of Electronics and Information Engineering, Sichuan University, China