🤖 AI Summary
Traditional image quality assessment (IQA) focuses primarily on technical distortions, neglecting subjective aesthetic dimensions such as stylistic preference, which limits progress in rendering aesthetics research. To address this gap, we introduce the novel task of Evaluation of Aesthetics of Rendering (EAR). We present DEAR, the first large-scale subjective preference benchmark dataset: built upon MIT-Adobe FiveK, it comprises image pairs annotated via pairwise preference judgments from 13,648 crowdworkers, with each pair rated by 25 independent annotators. We further propose a context-sensitive visual style annotation protocol and employ statistical modeling to quantify aesthetic differences. DEAR enables three key downstream tasks: style preference prediction, aesthetic benchmarking, and personalized aesthetic modeling. A subset of 100 images is publicly released on HuggingFace, establishing a reproducible benchmark for rendering aesthetics research.
📝 Abstract
Traditional Image Quality Assessment (IQA) focuses on quantifying technical degradations such as noise, blur, or compression artifacts, using both full-reference and no-reference objective metrics. However, evaluation of rendering aesthetics, a growing domain relevant to photographic editing, content creation, and AI-generated imagery, remains underexplored due to the lack of datasets that reflect the inherently subjective nature of style preference. In this work, a novel benchmark dataset designed to model human aesthetic judgments of image rendering styles is introduced: the Dataset for Evaluating the Aesthetics of Rendering (DEAR). Built upon the MIT-Adobe FiveK dataset, DEAR incorporates pairwise human preference scores collected via large-scale crowdsourcing, with each image pair evaluated by 25 distinct human evaluators and 13,648 evaluators participating overall. These annotations capture nuanced, context-sensitive aesthetic preferences, enabling the development and evaluation of models that go beyond traditional distortion-based IQA and address a new task: Evaluation of Aesthetics of Rendering (EAR). The data collection pipeline is described, human voting patterns are analyzed, and multiple use cases are outlined, including style preference prediction, aesthetic benchmarking, and personalized aesthetic modeling. To the best of the authors' knowledge, DEAR is the first dataset to systematically address aesthetic assessment of image rendering grounded in subjective human preferences. A subset of 100 images, together with their annotations, is published on HuggingFace (huggingface.co/datasets/vsevolodpl/DEAR).
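The abstract does not specify the exact statistical model used to turn the 25 votes per image pair into preference scores. As an illustration only, the sketch below converts hypothetical vote tallies into empirical preference probabilities; the function name, the rendering-style labels, and the simple vote-ratio aggregation are assumptions for this example, not the authors' method:

```python
def preference_score(votes_a: int, votes_b: int) -> float:
    """Empirical probability that rendering A is preferred over rendering B.

    Falls back to 0.5 (no preference) if no votes were cast.
    """
    total = votes_a + votes_b
    return votes_a / total if total else 0.5

# Hypothetical tallies: each pair is judged by 25 annotators,
# who each pick the rendering they prefer.
pairs = {
    ("expert_A", "expert_B"): (18, 7),   # 18 of 25 preferred expert_A
    ("expert_C", "expert_D"): (11, 14),  # 14 of 25 preferred expert_D
}

scores = {pair: preference_score(a, b) for pair, (a, b) in pairs.items()}
```

A score near 1.0 indicates a strong consensus for the first rendering, near 0.0 for the second, and near 0.5 a split preference; more elaborate aggregation (e.g. a Bradley-Terry model over many pairs) could be layered on the same tallies.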