HiRQA: Hierarchical Ranking and Quality Alignment for Opinion-Unaware Image Quality Assessment

📅 2025-08-20
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
To address severe dataset bias and poor generalization in no-reference image quality assessment (NR-IQA) caused by reliance on subjective human ratings, this paper proposes HiRQA: a self-supervised, label-free hierarchical quality assessment framework. Methodologically, HiRQA combines a higher-order ranking loss with an embedding distance loss and introduces a text-guided contrastive alignment mechanism, achieving perceptually consistent quality ranking and cross-modal feature alignment without requiring reference images or auxiliary modalities at inference. Critically, it is trained solely on synthetically distorted data yet generalizes effectively to real-world degradations, achieving state-of-the-art performance across multiple synthetic and authentic distortion benchmarks. Its lightweight variant, HiRQA-S, processes a single image in just 3.5 ms, demonstrating both strong generalization and practical deployability.

📝 Abstract
Despite significant progress in no-reference image quality assessment (NR-IQA), dataset biases and reliance on subjective labels continue to hinder generalization performance. We propose HiRQA (Hierarchical Ranking and Quality Alignment), a self-supervised, opinion-unaware framework that learns a hierarchical, quality-aware embedding through a combination of ranking and contrastive learning. Unlike prior approaches that depend on pristine references or auxiliary modalities at inference time, HiRQA predicts quality scores using only the input image. We introduce a novel higher-order ranking loss that supervises quality predictions through relational ordering across distortion pairs, along with an embedding distance loss that enforces consistency between feature distances and perceptual differences. A training-time contrastive alignment loss, guided by structured textual prompts, further enhances the learned representation. Trained only on synthetic distortions, HiRQA generalizes effectively to authentic degradations, as demonstrated through evaluation on various distortions such as lens flare, haze, motion blur, and low-light conditions. For real-time deployment, we introduce **HiRQA-S**, a lightweight variant with an inference time of only 3.5 ms per image. Extensive experiments across synthetic and authentic benchmarks validate HiRQA's state-of-the-art (SOTA) performance, strong generalization ability, and scalability.
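The two self-supervision signals described above can be sketched in miniature. This is an illustrative guess at the general form of such losses, not the paper's actual implementation: the function names, the hinge margin, and the L2 distance choice are all assumptions.

```python
# Minimal sketch (assumed form, not HiRQA's actual code) of two NR-IQA
# self-supervision signals: a pairwise margin ranking loss over distortion
# severity, and an embedding distance loss tying feature distance to
# perceptual difference. Margin value and L2 metric are illustrative choices.
import math

def ranking_loss(score_less_distorted, score_more_distorted, margin=0.1):
    """Hinge-style pairwise ranking: the less-distorted image of a pair
    should receive a higher predicted quality score, by at least `margin`."""
    return max(0.0, margin - (score_less_distorted - score_more_distorted))

def embedding_distance_loss(emb_a, emb_b, perceptual_diff):
    """Penalize mismatch between the L2 distance of two image embeddings
    and the perceptual difference between the images."""
    dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)))
    return (dist - perceptual_diff) ** 2

# Example: mild vs. heavy blur of the same scene.
loss_rank_good = ranking_loss(0.8, 0.3)   # correctly ordered, large gap -> 0.0
loss_rank_bad = ranking_loss(0.3, 0.8)    # misordered -> positive penalty
loss_embed = embedding_distance_loss([1.0, 0.0], [0.0, 0.0], 1.0)  # match -> 0.0
```

Because both signals are defined over pairs of synthetic distortions of the same image, training needs no human opinion scores, which is what makes the framework opinion-unaware.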
Problem

Research questions and friction points this paper is trying to address.

Self-supervised NR-IQA without subjective labels
Generalizing to authentic image distortions
Real-time quality assessment with lightweight variant
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-supervised hierarchical ranking framework
Higher-order ranking loss for distortion pairs
Contrastive alignment with textual prompts
Vaishnav Ramesh
RoboPI laboratory, Department of ECE, University of Florida, US
Haining Wang
RoboPI laboratory, Department of ECE, University of Florida, US
Md Jahidul Islam
Assistant Professor, University of Florida
Robotics · Visual Perception · Artificial Intelligence · Marine Robotics · Telerobotics