VisualQuality-R1: Reasoning-Induced Image Quality Assessment via Reinforcement Learning to Rank

📅 2025-05-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address weak generalization and the difficulty of fine-grained quality modeling in no-reference image quality assessment (NR-IQA), this paper proposes the first reasoning-driven reinforcement-learning-to-rank framework. Methodologically, it combines vision-language joint reasoning with group relative policy optimization (GRPO), models pairwise comparison probabilities under the Thurstone model, and replaces conventional binary preference labels with a continuous fidelity reward. Contributions include: (1) the first reasoning-augmented NR-IQA paradigm; (2) joint training across multiple datasets without requiring perceptual scale realignment; and (3) simultaneous generation of fine-grained quality scores and human-aligned natural-language quality descriptions. Experiments demonstrate significant gains over discriminative models and a state-of-the-art reasoning-based quality-regression method on mainstream NR-IQA benchmarks, along with strong cross-task generalization to super-resolution, image generation, and other quality-evaluation scenarios.

📝 Abstract
DeepSeek-R1 has demonstrated remarkable effectiveness in incentivizing reasoning and generalization capabilities of large language models (LLMs) through reinforcement learning. Nevertheless, the potential of reasoning-induced computational modeling has not been thoroughly explored in the context of image quality assessment (IQA), a task critically dependent on visual reasoning. In this paper, we introduce VisualQuality-R1, a reasoning-induced no-reference IQA (NR-IQA) model, and we train it with reinforcement learning to rank, a learning algorithm tailored to the intrinsically relative nature of visual quality. Specifically, for a pair of images, we employ group relative policy optimization to generate multiple quality scores for each image. These estimates are then used to compute comparative probabilities of one image having higher quality than the other under the Thurstone model. Rewards for each quality estimate are defined using continuous fidelity measures rather than discretized binary labels. Extensive experiments show that the proposed VisualQuality-R1 consistently outperforms discriminative deep learning-based NR-IQA models as well as a recent reasoning-induced quality regression method. Moreover, VisualQuality-R1 is capable of generating contextually rich, human-aligned quality descriptions, and supports multi-dataset training without requiring perceptual scale realignment. These features make VisualQuality-R1 especially well-suited for reliably measuring progress in a wide range of image processing tasks like super-resolution and image generation.
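The pairwise mechanism the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes a Thurstone-style Gaussian model over the group of sampled quality scores for each image, and a fidelity-style reward `sqrt(pq) + sqrt((1-p)(1-q))` as one plausible form of the "continuous fidelity measures" mentioned; the function names are hypothetical.

```python
import math
from statistics import mean, variance

def thurstone_prob(scores_a, scores_b, eps=1e-8):
    # Comparative probability that image A has higher quality than image B.
    # Each list holds multiple quality scores sampled for one image (as GRPO
    # generates several rollouts); the scores are treated as Gaussian under
    # the Thurstone model, so the comparison reduces to a Gaussian CDF.
    mu_a, mu_b = mean(scores_a), mean(scores_b)
    var_a, var_b = variance(scores_a), variance(scores_b)
    z = (mu_a - mu_b) / math.sqrt(var_a + var_b + eps)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

def fidelity_reward(p, q):
    # Continuous fidelity between the predicted preference p and the
    # ground-truth preference q (both in [0, 1]). It equals 1 only when
    # p == q, so it rewards calibrated probabilities instead of applying
    # a hard 0/1 correctness label.
    return math.sqrt(p * q) + math.sqrt((1.0 - p) * (1.0 - q))
```

For example, two identical score groups yield `thurstone_prob(...) == 0.5`, and a prediction that matches the ground-truth preference exactly receives the maximal reward of 1.0, with the reward degrading smoothly as the two diverge.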
Problem

Research questions and friction points this paper is trying to address.

Develop reasoning-induced image quality assessment model
Train model using reinforcement learning for ranking
Generate human-aligned quality descriptions without scale realignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

Reinforcement learning for image quality ranking
Group relative policy optimization for scores
Continuous fidelity measures as rewards