CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP

📅 2024-08-27
🏛️ International Conference on Pattern Recognition
📈 Citations: 5
Influential: 0
📄 PDF
🤖 AI Summary
To address the weak generalization and poor cross-category performance of existing models in AI-generated image (AIGI) quality assessment, this paper proposes CLIP-AGIQA, a CLIP-based end-to-end regression framework. Methodologically, it introduces a learnable multi-category textual prompting mechanism to fully exploit CLIP's joint vision-language representation, and it systematically validates CLIP's transferability to AIGI quality assessment. Evaluated on multiple benchmarks, including AGIQA-3K and AIGCIQA2023, CLIP-AGIQA achieves an average improvement of over 12% in Spearman rank-order correlation coefficient (SROCC) against state-of-the-art methods, establishing new SOTA performance. This work represents the first systematic integration of CLIP into no-reference, cross-domain AIGI quality assessment, offering a principled new paradigm for the task.

📝 Abstract
With the rapid development of generative technologies, AI-Generated Images (AIGIs) have been widely applied in various aspects of daily life. However, due to the immaturity of the technology, the quality of the generated images varies, so it is important to develop quality assessment techniques for the generated images. Although some models have been proposed to assess the quality of generated images, they are inadequate when faced with the ever-increasing and diverse categories of generated images. Consequently, the development of more advanced and effective models for evaluating the quality of generated images is urgently needed. Recent research has explored the significant potential of the visual language model CLIP in image quality assessment, finding that it performs well in evaluating the quality of natural images. However, its application to generated images has not been thoroughly investigated. In this paper, we build on this idea and further explore the potential of CLIP in evaluating the quality of generated images. We design CLIP-AGIQA, a CLIP-based regression model for quality assessment of generated images, leveraging rich visual and textual knowledge encapsulated in CLIP. Particularly, we implement multi-category learnable prompts to fully utilize the textual knowledge in CLIP for quality assessment. Extensive experiments on several generated image quality assessment benchmarks, including AGIQA-3K and AIGCIQA2023, demonstrate that CLIP-AGIQA outperforms existing IQA models, achieving excellent results in evaluating the quality of generated images.
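The abstract's core idea, scoring an image by comparing its CLIP embedding against learnable quality-level prompt embeddings, can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the shapes, the five-level scale, the temperature value, and the function names are all assumptions; in the actual model the prompt embeddings would be learned end-to-end alongside a regression head.

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between (M, D) and (K, D) matrices."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def quality_score(image_feat, prompt_feats, level_values, temperature=0.07):
    """Map CLIP-style features to a scalar quality score.

    image_feat:   (D,)  image embedding from the CLIP image encoder
    prompt_feats: (K, D) embeddings of K learnable quality-level prompts
                  (e.g. "bad" ... "perfect"), from the CLIP text encoder
    level_values: (K,)  numeric value assigned to each quality level
    """
    logits = cosine_sim(image_feat[None, :], prompt_feats)[0] / temperature
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                  # softmax over the K quality levels
    return float(probs @ level_values)    # expected quality under the softmax

# Toy demonstration with random features (real features come from CLIP)
rng = np.random.default_rng(0)
D, K = 8, 5
img = rng.normal(size=D)
prompts = rng.normal(size=(K, D))
levels = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # MOS-like 1-5 scale (assumed)
score = quality_score(img, prompts, levels)
```

Because the score is a softmax-weighted expectation over the level values, it is always bounded by the smallest and largest levels, which makes it a convenient regression target against human mean-opinion scores.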
Problem

Research questions and friction points this paper is trying to address.

Assessing quality of diverse AI-generated images effectively
Exploring CLIP's potential for generated image evaluation
Developing advanced models for consistent AIGI quality assessment
Innovation

Methods, ideas, or system contributions that make the work stand out.

CLIP-based regression model for image quality
Multi-category learnable prompts utilization
Leveraging CLIP's visual and textual knowledge
Zhenchen Tang
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Zichuan Wang
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Bo Peng
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Jing Dong
New Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences