From Filters to VLMs: Benchmarking Defogging Methods through Object Detection and Segmentation Performance

📅 2025-10-04
📈 Citations: 0
Influential: 0
🤖 AI Summary
Fog severely degrades perception performance in autonomous driving, yet existing dehazing methods often improve image fidelity without corresponding gains in downstream detection/segmentation tasks; moreover, most evaluations rely on synthetic data, raising concerns about generalizability. This paper introduces the first perception-oriented, transparent dehazing benchmark, systematically evaluating traditional filters, deep dehazing networks, cascade strategies (filter↔model), and prompt-based vision-language models (VLMs) as both image editors and quality evaluators. Key contributions include: (1) the first use of VLMs for dehazing assessment, revealing strong correlation (r > 0.92) between VLM scores and detection mAP; and (2) empirical analysis on Foggy Cityscapes that delineates method-specific applicability boundaries, synergistic benefits, and degradation conditions—establishing a reproducible, interpretable evaluation paradigm for perception-driven dehazing.
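The headline finding above is a strong Pearson correlation (r > 0.92) between VLM rubric scores and detection mAP across defogging methods. A minimal sketch of that analysis step is below; the per-method score arrays are hypothetical placeholders for illustration, not values from the paper.

```python
# Sketch of the VLM-as-judge correlation analysis: given one rubric score
# from a VLM judge and one detection mAP per defogging pipeline, compute
# the Pearson correlation coefficient between the two score lists.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-method scores (one entry per defogging pipeline):
vlm_scores = [3.1, 4.2, 2.5, 4.8, 3.9]   # VLM judge, rubric scale
map_scores = [0.31, 0.40, 0.27, 0.45, 0.38]  # downstream detection mAP

print(f"Pearson r = {pearson_r(vlm_scores, map_scores):.3f}")
```

A high r on such data would support using the cheap VLM score as a proxy for the expensive detection benchmark, which is the paper's stated use case.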

📝 Abstract
Autonomous driving perception systems are particularly vulnerable in foggy conditions, where light scattering reduces contrast and obscures fine details critical for safe operation. While numerous defogging methods exist, from handcrafted filters to learned restoration models, improvements in image fidelity do not consistently translate into better downstream detection and segmentation. Moreover, prior evaluations often rely on synthetic data, leaving questions about real-world transferability. We present a structured empirical study that benchmarks a comprehensive set of pipelines, including (i) classical filters, (ii) modern defogging networks, (iii) chained variants (filter→model, model→filter), and (iv) prompt-driven vision-language image editing models (VLMs) applied directly to foggy images. Using Foggy Cityscapes, we assess both image quality and downstream performance on object detection (mAP) and segmentation (PQ, RQ, SQ). Our analysis reveals when defogging helps, when chaining yields synergy or degradation, and how VLM-based editors compare to dedicated approaches. In addition, we evaluate qualitative rubric-based scores from a VLM judge and quantify their alignment with task metrics, showing strong correlations with mAP. Together, these results establish a transparent, task-oriented benchmark for defogging methods and highlight the conditions under which preprocessing genuinely improves autonomous perception in adverse weather.
Problem

Research questions and friction points this paper is trying to address.

Evaluating defogging methods' impact on object detection and segmentation
Assessing real-world transferability beyond synthetic fog data
Determining when preprocessing improves autonomous driving perception
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking classical filters and modern defogging networks
Evaluating chained filter-model pipelines for performance synergy
Assessing VLM-based editors against dedicated defogging methods
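The chained pipelines above are sequences of image-to-image stages, so filter→model and model→filter are just two orderings of the same stages. A minimal sketch of that composition idea follows; the two stage functions are hypothetical stand-ins (a toy contrast boost and a toy brightness shift), not the paper's actual filters or networks.

```python
# Sketch of cascade (chained) defogging pipelines: compose image-to-image
# stages in either order and compare the results.

from functools import reduce

def chain(*stages):
    """Compose image-to-image stages left to right into one pipeline."""
    return lambda image: reduce(lambda img, stage: stage(img), stages, image)

# Hypothetical stages operating on a toy "image" (a list of pixel values).
def contrast_filter(img):   # stand-in for a classical filter (e.g. CLAHE)
    return [min(255, round(p * 1.2)) for p in img]

def dehaze_model(img):      # stand-in for a learned defogging network
    return [max(0, p - 10) for p in img]

foggy = [100, 150, 200]
filter_then_model = chain(contrast_filter, dehaze_model)
model_then_filter = chain(dehaze_model, contrast_filter)

print(filter_then_model(foggy))  # [110, 170, 230]
print(model_then_filter(foggy))  # [108, 168, 228]
```

Even in this toy setting the two orderings disagree, which is the phenomenon the benchmark probes: chaining can yield synergy or degradation depending on stage order.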
👥 Authors (University of Southern California): Ardalan Aryashad, Parsa Razmara, Amin Mahjoub, Seyedarmin Azizi, Mahdi Salmani, Arad Firouzkouhi