🤖 AI Summary
This work addresses the lack of human-level perceptual reasoning and judgment consistency in blind image quality assessment (BIQA) models. To this end, we propose a perception–reasoning cascaded framework that explicitly models the human cognitive chain of "sensory input → implicit reasoning → quality judgment" as a learnable, self-consistent reasoning path. We further introduce a reinforcement learning reward grounded in self-generated quality descriptions, balancing alignment with human preferences against internal logical consistency. Our method integrates human annotations, natural language generation, and ROUGE-1-based interpretability evaluation to achieve end-to-end interpretable BIQA. Experiments show score prediction on par with state-of-the-art methods in Pearson and Spearman correlation coefficients, and a ROUGE-1 score of 0.512 versus a 0.443 baseline, indicating strong fidelity to human reasoning chains.
📝 Abstract
Humans assess image quality through a perception-reasoning cascade, integrating sensory cues with implicit reasoning to form self-consistent judgments. In this work, we investigate how a model can acquire both human-like and self-consistent reasoning capabilities for blind image quality assessment (BIQA). We first collect human evaluation data that capture several aspects of the human perception-reasoning pipeline. We then adopt reinforcement learning, using the human annotations as reward signals to guide the model toward human-like perception and reasoning. To help the model internalize self-consistent reasoning, we design a reward that drives it to infer image quality purely from its self-generated descriptions. Empirically, our approach achieves score prediction performance comparable to state-of-the-art BIQA systems under standard metrics, including the Pearson and Spearman correlation coefficients. Beyond the rating score, we assess human-model alignment using ROUGE-1 to measure the similarity between model-generated and human perception-reasoning chains. On over 1,000 human-annotated samples, our model reaches a ROUGE-1 score of 0.512 (cf. 0.443 for the baseline), indicating substantial coverage of human explanations and marking a step toward human-like, interpretable reasoning in BIQA.
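To make the quantities mentioned above concrete, the sketch below shows one way a combined human-alignment / self-consistency reward and the reported metrics (ROUGE-1 unigram F1, Pearson/Spearman correlations) could be computed. This is a minimal illustration, not the paper's implementation: the weighting `alpha`, the function names, and the exact reward form are assumptions made for clarity.

```python
# A minimal sketch (not the paper's implementation) of the quantities described
# above: a combined human-alignment / self-consistency reward, plus the reported
# evaluation metrics (ROUGE-1 unigram F1 and Pearson/Spearman correlations).
from collections import Counter
from scipy.stats import pearsonr, spearmanr


def self_consistency_reward(score_from_image: float,
                            score_from_description: float,
                            human_score: float,
                            alpha: float = 0.5) -> float:
    """Hypothetical reward combining (a) agreement with the human rating and
    (b) agreement between the score predicted from the image and the score
    re-inferred purely from the model's own quality description."""
    human_term = -abs(score_from_image - human_score)                    # human alignment
    consistency_term = -abs(score_from_image - score_from_description)   # self-consistency
    return alpha * human_term + (1.0 - alpha) * consistency_term


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram ROUGE-1 F1 between a human explanation and a model explanation."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2.0 * precision * recall / (precision + recall)


def correlation_metrics(predicted_scores, human_scores):
    """PLCC / SRCC between predicted quality scores and human mean opinion scores."""
    plcc, _ = pearsonr(predicted_scores, human_scores)
    srcc, _ = spearmanr(predicted_scores, human_scores)
    return plcc, srcc
```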