CLIP-DQA: Blindly Evaluating Dehazed Images from Global and Local Perspectives Using CLIP

📅 2025-02-03
📈 Citations: 0 · Influential citations: 0
🤖 AI Summary
Blind dehazed image quality assessment (BDQA) must predict the visual quality of dehazed images without any reference image, and annotated dehazing quality assessment (DQA) data remain scarce and small in scale. To address both issues, this paper introduces the CLIP model to BDQA for the first time and proposes a dual-branch prompt-learning framework. The method feeds global and local views of the dehazed image into CLIP to capture hierarchical visual information, and jointly tunes CLIP's vision and language branches with quality-oriented, learnable prompts so that this information maps to a quality score end to end. By leveraging CLIP's large-scale image-text pre-training, the approach avoids handcrafted features and large-scale DQA annotations, mitigating the impact of data scarcity. On two authentic DQA benchmarks, CLIP-DQA delivers more accurate quality predictions than existing BDQA methods. The source code is publicly available.
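To make the dual-branch design concrete, below is a minimal PyTorch sketch: learnable prompt tokens are prepended to the inputs of both a vision encoder and a language encoder, and global and local image features are fused before matching against the text feature. The stub transformer encoders, the additive fusion, and all module names and dimensions are illustrative assumptions, not the authors' implementation, which builds on pre-trained CLIP encoders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptedEncoder(nn.Module):
    """Stub encoder with learnable prompt tokens prepended to its input."""
    def __init__(self, dim=512, n_prompts=4):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, tokens):  # tokens: (batch, seq_len, dim)
        prompts = self.prompts.unsqueeze(0).expand(tokens.size(0), -1, -1)
        out = self.backbone(torch.cat([prompts, tokens], dim=1))
        return out[:, 0]  # first (prompt) token as the pooled feature

class CLIPDQASketch(nn.Module):
    """Dual-branch sketch: prompted vision/language encoders, global-local fusion."""
    def __init__(self, dim=512):
        super().__init__()
        self.vision = PromptedEncoder(dim)    # shared across global/local views
        self.language = PromptedEncoder(dim)

    def forward(self, global_tokens, local_tokens, text_tokens):
        g = self.vision(global_tokens)   # feature of the whole image
        p = self.vision(local_tokens)    # feature of a local patch
        image_feat = F.normalize(g + p, dim=-1)  # naive additive fusion
        text_feat = F.normalize(self.language(text_tokens), dim=-1)
        return (image_feat * text_feat).sum(dim=-1)  # cosine similarity

# Toy usage with random embeddings standing in for CLIP tokenization.
model = CLIPDQASketch()
sim = model(torch.randn(2, 49, 512), torch.randn(2, 49, 512), torch.randn(2, 8, 512))
print(sim.shape)  # torch.Size([2])
```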

📝 Abstract
Blind dehazed image quality assessment (BDQA), which aims to accurately predict the visual quality of dehazed images without any reference information, is essential for the evaluation, comparison, and optimization of image dehazing algorithms. Existing learning-based BDQA methods have achieved remarkable success, but the small scale of DQA datasets limits their performance. To address this issue, in this paper, we propose to adapt Contrastive Language-Image Pre-Training (CLIP), pre-trained on large-scale image-text pairs, to the BDQA task. Specifically, inspired by the fact that the human visual system understands images based on hierarchical features, we take global and local information of the dehazed image as the input of CLIP. To accurately map the input hierarchical information of dehazed images into the quality score, we tune both the vision branch and language branch of CLIP with prompt learning. Experimental results on two authentic DQA datasets demonstrate that our proposed approach, named CLIP-DQA, achieves more accurate quality predictions than existing BDQA methods. The code is available at https://github.com/JunFu1995/CLIP-DQA.
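The abstract does not spell out how image-text similarity is turned into a scalar quality score. One common readout in CLIP-based quality assessment (e.g., CLIP-IQA) is a softmax over the image feature's similarities to an antonym prompt pair; the hedged sketch below assumes that style of readout, with the caveat that in CLIP-DQA the fixed prompt pair would be replaced by learned prompt embeddings.

```python
import torch
import torch.nn.functional as F

def quality_score(image_feat, good_feat, bad_feat, tau=0.01):
    """Softmax over similarities to 'good' vs. 'bad' prompts; returns P(good)."""
    image_feat = F.normalize(image_feat, dim=-1)
    good = F.normalize(good_feat, dim=-1)
    bad = F.normalize(bad_feat, dim=-1)
    sims = torch.stack(
        [(image_feat * good).sum(-1), (image_feat * bad).sum(-1)], dim=-1
    )
    return F.softmax(sims / tau, dim=-1)[..., 0]  # score in (0, 1)

# Toy usage with random features standing in for CLIP embeddings.
scores = quality_score(torch.randn(4, 512), torch.randn(512), torch.randn(512))
print(scores)  # four scores, one per image
```

The temperature tau mirrors CLIP's learned logit scale; a smaller value pushes each score toward 0 or 1.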
Problem

Research questions and friction points this paper is trying to address.

Blind dehazed image quality assessment
Global and local image perspectives
Adapting CLIP for quality prediction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts CLIP for BDQA tasks
Uses global and local image features (see the view-construction sketch after this list)
Tunes CLIP with prompt learning
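As a concrete illustration of the global-local input referenced above, the snippet below builds a global view by resizing the whole dehazed image to CLIP's 224×224 input and a local view by random cropping. The exact view construction and the file name are assumptions for illustration only.

```python
from PIL import Image
import torchvision.transforms as T

to_global = T.Compose([T.Resize((224, 224)), T.ToTensor()])              # coarse structure
to_local = T.Compose([T.RandomCrop(224, pad_if_needed=True), T.ToTensor()])  # fine texture

img = Image.open("dehazed.png").convert("RGB")  # hypothetical input file
global_view = to_global(img)  # (3, 224, 224) tensor of the full image
local_view = to_local(img)    # (3, 224, 224) tensor of one local patch
```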
👥 Authors
Yirui Zeng
School of Computer Science and Informatics, Cardiff University, United Kingdom
Jun Fu
School of Computer Science and Informatics, Cardiff University, United Kingdom
Hadi Amirpour
University of Klagenfurt
Video Compression · Quality of Experience · Video Streaming · Medical Image Processing
Huasheng Wang
Cardiff University
Image quality assessment
Guanghui Yue
School of Biomedical Engineering, Shenzhen University, China
Hantao Liu
Full Professor of Computer Science, Cardiff University
Artificial Intelligence · Image and Video Processing · Applied Perception · Medical Imaging
Ying Chen
Alibaba Group
Wei Zhou
School of Computer Science and Informatics, Cardiff University, United Kingdom