🤖 AI Summary
Existing image quality assessment (IQA) metrics are designed for human visual perception and exhibit low sensitivity to variations in deep neural network (DNN) performance, thus failing to reliably predict model behavior on real-world data.
Method: This paper pioneers a causal re-framing of IQA, establishing a formal causal framework linking image quality to DNN robustness. It proposes a novel, task-agnostic quality metric that estimates dataset difficulty without training downstream task models, leveraging causal inference and statistical dependence modeling.
Contribution/Results: We theoretically prove and empirically validate that mainstream IQA metrics possess weak predictive power for classification accuracy. On multiple benchmark datasets, our metric achieves an average improvement of 0.42 in Spearman’s ρ correlation with DNN accuracy—substantially enhancing the predictability of model performance under realistic image degradations.
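The reported gain is measured as Spearman's ρ between a metric's quality scores and DNN classification accuracy across degraded datasets. A minimal sketch of that evaluation protocol, using synthetic placeholder scores (the actual metric and accuracies are not specified here):

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical per-dataset values: one quality score per degraded dataset
# and the corresponding DNN accuracy. Real values would come from the
# metric under study and a classifier evaluated on each dataset.
quality = rng.uniform(0.0, 1.0, size=20)
accuracy = 0.9 * quality + rng.normal(0.0, 0.05, size=20)  # noisy monotone link

# Rank correlation: how well does the quality score order datasets by accuracy?
rho, pval = spearmanr(quality, accuracy)
print(f"Spearman rho = {rho:.2f} (p = {pval:.3g})")
```

A metric well aligned with DNN sensitivities should yield ρ close to 1; the paper's claim is that conventional IQA metrics sit far below this while the proposed metric improves ρ by 0.42 on average.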
📝 Abstract
Image quality plays an important role in the performance of deep neural networks (DNNs), and DNNs have been widely shown to exhibit sensitivity to changes in imaging conditions. Large-scale datasets often contain images captured under a wide range of conditions, prompting a need to quantify and understand their underlying quality distribution in order to better characterize DNN performance and robustness. Aligning the sensitivities of image quality metrics and DNNs ensures that estimates of quality can act as proxies for image/dataset difficulty independent of the task models trained/evaluated on the data. Conventional image quality assessment (IQA) seeks to measure and align quality relative to human perceptual judgments, but here we seek a quality measure that is not only sensitive to imaging conditions but also well aligned with DNN sensitivities. We first ask whether conventional IQA metrics are also informative of DNN performance. To answer this question, we reframe IQA from a causal perspective and examine the conditions under which quality metrics are predictive of DNN performance. We show theoretically and empirically that current IQA metrics are weak predictors of DNN performance in the context of classification. We then use our causal framework to provide an alternative formulation and a new image quality metric that is more strongly correlated with DNN performance and can act as a prior on performance without training new task models. Our approach provides a means to directly estimate the quality distribution of large-scale image datasets toward characterizing the relationship between dataset composition and DNN performance.
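The abstract's final claim is that a per-image quality score can summarize a dataset's quality distribution without training any task model. A toy illustration of that workflow, using a stand-in sharpness proxy (mean squared gradient magnitude) rather than the paper's metric, on synthetic "sharp" vs. "degraded" images:

```python
import numpy as np

def quality_proxy(img):
    """Stand-in scalar quality score: mean squared gradient magnitude.
    This is NOT the paper's metric, just an illustrative proxy that
    drops for low-contrast / blurred images."""
    gx = np.diff(img, axis=1)
    gy = np.diff(img, axis=0)
    return float((gx ** 2).mean() + (gy ** 2).mean())

rng = np.random.default_rng(1)
# Synthetic "dataset": 50 high-contrast images and 50 low-contrast ones,
# standing in for images captured under good vs. poor conditions.
sharp = [rng.uniform(0.0, 1.0, (32, 32)) for _ in range(50)]
degraded = [0.1 * rng.uniform(0.0, 1.0, (32, 32)) for _ in range(50)]

scores = np.array([quality_proxy(im) for im in sharp + degraded])

# Summarize the dataset's quality distribution directly from the scores,
# with no downstream classifier in the loop.
print(f"mean={scores.mean():.4f}  "
      f"p10={np.percentile(scores, 10):.4f}  "
      f"p90={np.percentile(scores, 90):.4f}")
```

The resulting distribution (here clearly bimodal) is the kind of dataset-level summary the paper proposes to use as a prior on DNN performance.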