Data Quality Matters: Quantifying Image Quality Impact on Machine Learning Performance

📅 2025-03-28
🤖 AI Summary
To address perception distortion induced by sensor data compression and virtualization in autonomous driving, this paper proposes a four-step quantitative framework: (1) constructing paired distorted image sets; (2) measuring image fidelity degradation using LPIPS, SSIM, and PSNR; (3) evaluating task-level performance degradation—specifically mAP reduction and increased localization error—on object detection models (YOLO, Faster R-CNN); and (4) establishing statistical correlations between image quality metrics and downstream task performance. This work is the first to quantitatively characterize the relationship between image distortion magnitude and model robustness degradation in autonomous driving perception tasks. Results show that LPIPS exhibits the strongest correlation with performance degradation (Spearman’s ρ > 0.92), significantly outperforming SSIM and PSNR. The framework establishes a reproducible, interpretable, data-quality-driven paradigm for robustness validation of machine learning systems in safety-critical perception applications.

📝 Abstract
Precise perception of the environment is essential in highly automated driving systems, which rely on machine learning tasks such as object detection and segmentation. Sensor data is commonly compressed for data handling and virtualized for hardware-in-the-loop validation. Both methods can alter sensor data and degrade model performance, which calls for a systematic approach to quantifying image validity. This paper presents a four-step framework to evaluate the impact of image modifications on machine learning tasks. First, a dataset of modified images is prepared so that every modified image has a one-to-one match with its original, enabling measurement of the deviations caused by compression and virtualization. Second, image deviations are quantified by comparing the compressed and virtualized images against the original camera-based sensor data. Third, the performance of state-of-the-art object detection models is analyzed to determine how altered input data affects perception tasks, including bounding box accuracy and reliability. Finally, a correlation analysis is performed to identify relationships between image quality and model performance. The results show that the LPIPS metric achieves the highest correlation between image deviation and machine learning performance across all evaluated machine learning tasks.
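
The second step, quantifying the deviation between an original image and its modified counterpart, can be illustrated with a simple full-reference metric. The following is a minimal sketch of PSNR in plain Python, not the paper's implementation; the function name `psnr`, the flat pixel lists, and the sample values are assumptions made for illustration only.

```python
import math


def psnr(reference, modified, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized pixel sequences.

    `reference` and `modified` are flat sequences of pixel intensities;
    this flat layout is an assumption made to keep the sketch dependency-free.
    """
    if len(reference) != len(modified) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all paired pixels.
    mse = sum((r - m) ** 2 for r, m in zip(reference, modified)) / len(reference)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 8-pixel image pair (illustrative values, not data from the paper).
original = [52, 55, 61, 59, 79, 61, 76, 61]
compressed = [54, 55, 60, 59, 77, 63, 76, 60]
print(f"PSNR: {psnr(original, compressed):.2f} dB")
```

A higher PSNR means a smaller pixel-wise deviation. Learned metrics such as LPIPS compare deep feature activations instead of raw pixels and would need a pretrained network, so they are not sketched here.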
Problem

Research questions and friction points this paper is trying to address.

Quantify image quality impact on ML performance
Evaluate compression and virtualization effects on perception
Correlate image deviations with model accuracy metrics
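
One way to express the effect of a modification on bounding box accuracy is the intersection over union (IoU) between the box a detector predicts on the original image and the box it predicts on the modified one. The sketch below is an assumption about how such a comparison could be set up, not the paper's evaluation code; the `(x_min, y_min, x_max, y_max)` box layout and the sample boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max) tuples; this layout is an
    assumption of the sketch.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield no intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical detections for the same object (illustrative values only).
box_on_original = (100, 80, 200, 180)
box_on_modified = (110, 85, 205, 190)
print(f"IoU: {iou(box_on_original, box_on_modified):.3f}")
```

An IoU near 1 indicates that the modification barely moved the predicted box, while lower values indicate growing localization error.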
Innovation

Methods, ideas, or system contributions that make the work stand out.

Four-step framework evaluates image modification impact
Quantifies deviations from compression and virtualization
Correlates image quality with model performance metrics
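
The correlation step can be illustrated with a rank correlation between per-image deviation scores and per-image performance drops. The following is a minimal sketch of Spearman's rank correlation in plain Python; the helper names and the sample numbers are hypothetical and are not results from the paper.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-image values (illustrative only, not data from the paper):
# an image deviation score and the corresponding drop in detection quality.
deviation = [0.05, 0.10, 0.20, 0.40]
performance_drop = [0.01, 0.03, 0.02, 0.10]
print(f"Spearman rho: {spearman(deviation, performance_drop):.2f}")
```

A rank correlation is used in this sketch because it only assumes a monotonic relationship between deviation and performance, not a linear one.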