CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP

📅 2024-08-27
🏛️ International Conference on Pattern Recognition
📈 Citations: 5
Influential: 0
📄 PDF
🤖 AI Summary
To address the weak generalization and poor cross-category performance of existing models in AI-generated image (AIGI) quality assessment, this paper proposes CLIP-AGIQA, a CLIP-based end-to-end regression framework. Methodologically, it introduces a learnable multi-category textual prompting mechanism to fully exploit CLIP's joint vision-language representation, and it systematically validates CLIP's transferability to AIGI quality assessment. Evaluated on multiple benchmarks, including AGIQA-3K and AIGCIQA2023, CLIP-AGIQA achieves an average improvement of over 12% in Spearman rank-order correlation coefficient (SROCC) against state-of-the-art methods, establishing new SOTA performance. This work represents the first systematic integration of CLIP into no-reference, cross-domain AIGI quality assessment, offering a principled new paradigm for the task.

📝 Abstract
With the rapid development of generative technologies, AI-Generated Images (AIGIs) have been widely applied in various aspects of daily life. However, due to the immaturity of the technology, the quality of the generated images varies, so it is important to develop quality assessment techniques for the generated images. Although some models have been proposed to assess the quality of generated images, they are inadequate when faced with the ever-increasing and diverse categories of generated images. Consequently, the development of more advanced and effective models for evaluating the quality of generated images is urgently needed. Recent research has explored the significant potential of the visual language model CLIP in image quality assessment, finding that it performs well in evaluating the quality of natural images. However, its application to generated images has not been thoroughly investigated. In this paper, we build on this idea and further explore the potential of CLIP in evaluating the quality of generated images. We design CLIP-AGIQA, a CLIP-based regression model for quality assessment of generated images, leveraging rich visual and textual knowledge encapsulated in CLIP. Particularly, we implement multi-category learnable prompts to fully utilize the textual knowledge in CLIP for quality assessment. Extensive experiments on several generated image quality assessment benchmarks, including AGIQA-3K and AIGCIQA2023, demonstrate that CLIP-AGIQA outperforms existing IQA models, achieving excellent results in evaluating the quality of generated images.
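The abstract's core idea, scoring an image by comparing its CLIP embedding against learnable quality-level prompt embeddings, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the shapes, the five-level scale, the temperature value, and the function names are all assumptions; in the actual model the prompt embeddings would be learned end-to-end alongside a regression head.

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between (M, D) and (K, D) matrices."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def quality_score(image_feat, prompt_feats, level_values, temperature=0.07):
    """Map CLIP-style features to a scalar quality score.

    image_feat:   (D,)  image embedding from the CLIP image encoder
    prompt_feats: (K, D) embeddings of K learnable quality-level prompts
                  (e.g. "bad" ... "perfect"), from the CLIP text encoder
    level_values: (K,)  numeric value assigned to each quality level
    """
    logits = cosine_sim(image_feat[None, :], prompt_feats)[0] / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over the K quality levels
    return float(probs @ level_values)    # expected quality under the softmax

# Toy demonstration with random features (real features come from CLIP)
rng = np.random.default_rng(0)
D, K = 8, 5
img = rng.normal(size=D)
prompts = rng.normal(size=(K, D))
levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # MOS-like 1-5 scale (assumed)
score = quality_score(img, prompts, levels)
```

Because the score is a softmax-weighted expectation over the level values, it is always bounded by the smallest and largest levels, which makes it a convenient regression target against human mean-opinion scores.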
Problem

Research questions and friction points this paper is trying to address.

Assessing quality of diverse AI-generated images effectively
Exploring CLIP's potential for generated image evaluation
Developing advanced models for consistent AIGI quality assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

CLIP-based regression model for image quality
Multi-category learnable prompts utilization
Leveraging CLIP's visual and textual knowledge
Zhenchen Tang
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Zichuan Wang
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Bo Peng
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Jing Dong
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences