🤖 AI Summary
Blind dehazed image quality assessment (BDQA) must predict quality without any reference image and suffers from scarce, small-scale annotated data. To address these issues, this paper introduces the CLIP model to BDQA for the first time and proposes a dual-branch prompt-learning framework. It employs a global–local joint input mechanism to fuse multi-scale visual features and jointly tunes CLIP’s vision and language encoders with quality-oriented, learnable prompts, enabling end-to-end quality scoring. The method eliminates reliance on handcrafted features and large-scale dehazing quality assessment (DQA) annotations, thereby alleviating data scarcity. Evaluated on two authentic (reference-free) DQA benchmarks, the approach outperforms existing state-of-the-art methods. The source code is publicly available.
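The global–local joint input described above can be illustrated with a minimal sketch: one full-image (global) view is combined with several random local crops before feature extraction. All sizes, the number of crops, and the function name are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def extract_global_local_views(image, crop=64, n_local=4, seed=0):
    """Build one global view (the full image) plus several random local
    crops, mimicking a global-local joint input. Crop size and count
    are illustrative, not the paper's actual settings."""
    rng = np.random.default_rng(seed)
    h, w = image.shape[:2]
    views = [image]  # global view: the whole dehazed image
    for _ in range(n_local):
        # sample the top-left corner of a local crop uniformly
        y = int(rng.integers(0, h - crop + 1))
        x = int(rng.integers(0, w - crop + 1))
        views.append(image[y:y + crop, x:x + crop])
    return views

# A dummy 224x224 RGB image standing in for a dehazed input.
img = np.zeros((224, 224, 3))
views = extract_global_local_views(img)
```

Each view would then be encoded by CLIP's vision branch, so the model sees both the overall haze-removal result and fine local detail.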
📝 Abstract
Blind dehazed image quality assessment (BDQA), which aims to accurately predict the visual quality of dehazed images without any reference information, is essential for the evaluation, comparison, and optimization of image dehazing algorithms. Existing learning-based BDQA methods have achieved remarkable success, but their performance is limited by the small scale of DQA datasets. To address this issue, in this paper, we propose to adapt Contrastive Language-Image Pre-Training (CLIP), pre-trained on large-scale image-text pairs, to the BDQA task. Specifically, inspired by the fact that the human visual system understands images based on hierarchical features, we take global and local information of the dehazed image as the input of CLIP. To accurately map the input hierarchical information of dehazed images into the quality score, we tune both the vision branch and language branch of CLIP with prompt learning. Experimental results on two authentic DQA datasets demonstrate that our proposed approach, named CLIP-DQA, achieves more accurate quality predictions than existing BDQA methods. The code is available at https://github.com/JunFu1995/CLIP-DQA.
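How a CLIP-style model turns embeddings into a scalar quality score can be sketched as follows: the image embedding is compared against a "good quality" / "bad quality" prompt pair, and a softmax over the two cosine similarities yields the score. This mirrors the common CLIP-for-IQA recipe rather than CLIP-DQA's exact learned prompts; the embeddings below are random stand-ins for real CLIP encoder outputs, and the temperature value is an assumption.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def clip_style_quality_score(image_emb, good_prompt_emb, bad_prompt_emb, tau=0.07):
    """Map an image embedding to a scalar in [0, 1] via a softmax over its
    similarities to a 'good quality' / 'bad quality' prompt pair.
    tau is an illustrative temperature, not the paper's value."""
    sims = np.array([cosine_sim(image_emb, good_prompt_emb),
                     cosine_sim(image_emb, bad_prompt_emb)]) / tau
    probs = np.exp(sims - sims.max())  # numerically stable softmax
    probs /= probs.sum()
    return float(probs[0])  # probability of "good" acts as the quality score

# Toy 512-d embeddings standing in for CLIP encoder outputs.
rng = np.random.default_rng(0)
good, bad = rng.normal(size=512), rng.normal(size=512)
img = 0.8 * good + 0.2 * rng.normal(size=512)  # image aligned with "good"
score = clip_style_quality_score(img, good, bad)
```

In the actual method, the prompt embeddings are learnable and tuned jointly with the vision branch so this mapping aligns with human quality ratings.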