🤖 AI Summary
Existing image quality assessment (IQA) methods predominantly produce a single holistic score, overlooking the multi-dimensional nature of human perception, which encompasses both technical and aesthetic attributes. To address this limitation, we propose MDIQA, a novel multi-dimensional IQA framework that jointly models five technical dimensions (e.g., noise, blur) and four aesthetic dimensions (e.g., composition, color harmony). MDIQA employs a multi-branch deep network to extract dimension-specific features, followed by supervised feature fusion with adjustable dimension weights. Crucially, it enables user-preference-driven training for customizable image restoration. Extensive experiments demonstrate that MDIQA achieves state-of-the-art performance across multiple mainstream IQA benchmarks. Moreover, when integrated into image restoration pipelines, it significantly enhances subjective visual quality, validating its effectiveness in real-world, preference-aware applications.
📝 Abstract
Recent advances in image quality assessment (IQA), driven by sophisticated deep neural network designs, have significantly improved the ability to approximate human perception. However, most existing methods focus solely on fitting the overall score, neglecting the fact that humans typically evaluate image quality along different dimensions before arriving at an overall assessment. To overcome this problem, we propose a multi-dimensional image quality assessment (MDIQA) framework. Specifically, we model image quality across various perceptual dimensions, including five technical and four aesthetic dimensions, capturing the multifaceted nature of human visual perception in distinct branches. Each branch of MDIQA is first trained under the guidance of a separate dimension, and the resulting features are then fused to produce the final IQA score. Once trained, MDIQA can also be deployed for flexible training of image restoration (IR) models: by adjusting the perceptual dimension weights, the restoration results can be aligned with varying user preferences. Extensive experiments demonstrate that MDIQA achieves superior performance and can be applied effectively and flexibly to image restoration tasks. The code is available at https://github.com/YaoShunyu19/MDIQA.
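The preference-driven fusion described above, where per-dimension scores are combined under user-adjustable weights, can be sketched in plain Python. This is a minimal illustration, not the paper's actual fusion module: the linear weighted sum, the `fuse_quality` helper, and the example score values are all assumptions; in MDIQA the per-dimension scores would come from trained deep branches and the fusion is learned.

```python
def fuse_quality(dim_scores: dict, weights: dict) -> float:
    """Combine per-dimension quality scores into one overall score.

    A normalized weighted sum stands in for MDIQA's learned fusion;
    raising a dimension's weight shifts the overall score toward it.
    """
    total_w = sum(weights.values())
    return sum(weights[d] * dim_scores[d] for d in dim_scores) / total_w

# Illustrative branch outputs for the dimensions named in the abstract
# (two technical, two aesthetic); the values are made up.
scores = {"noise": 0.8, "blur": 0.6, "composition": 0.9, "color_harmony": 0.7}

# Uniform weighting vs. a hypothetical user preference for aesthetics.
equal = {d: 1.0 for d in scores}
aesthetic_pref = {"noise": 0.5, "blur": 0.5,
                  "composition": 2.0, "color_harmony": 2.0}

print(fuse_quality(scores, equal))           # plain mean of the four scores
print(fuse_quality(scores, aesthetic_pref))  # pulled toward the aesthetic dims
```

Used as a differentiable loss on a restoration network's outputs, the same weight adjustment would steer training toward whichever perceptual dimensions the user cares about.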