AI Outperforms Humans in Personalized Image Aesthetics Assessment via LLM-Based Interviews and Semantic Feature Extraction

📅 2026-05-14

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Personalized image aesthetic assessment remains highly challenging because aesthetic preferences are strongly subjective. This work proposes an approach that integrates deep learning with large language models (LLMs), using an LLM to conduct semi-structured interviews that actively elicit each user’s aesthetic preferences. These preferences are then combined with low-level image features and high-level semantic features to build a personalized aesthetic evaluation model. The resulting system predicts ratings more accurately than conventional methods, human raters, and users’ own delayed re-evaluations, with prediction error below the individual’s within-person variability; the advantage is most pronounced on highly rated images. The authors read this as evidence that AI can exceed human-level performance in personalized aesthetic assessment.

📝 Abstract

Accurately predicting an individual's aesthetic evaluation of images is a fundamental challenge for AI. Various deep learning (DL)-based models have been proposed for this task, training on image evaluation data to extract objective low-level features. However, aesthetic preferences are inherently subjective and individual-dependent. Accurate prediction thus requires the extraction of high-level semantic features of images and the active collection of preference information from the target individual. To address this issue, we focus on the utility of Large Language Models (LLMs) pretrained on vast amounts of textual data, and develop an integrated DL-LLM system. The system actively elicits aesthetic preferences through LLM-based semi-structured interviews and predicts aesthetic evaluation by leveraging both low-level and high-level features. In our experiments, we compare the proposed system against conventional systems, human predictors, and the target individual's own re-evaluations after a certain time interval. Our results show that the proposed system outperforms all of them, with particularly strong performance on highly rated images. Moreover, the prediction error of the proposed system is smaller than within-person variability, while human predictors show the largest error, likely due to the influence of their own aesthetic values. These results suggest that AI may be better positioned than other people or one's future self to capture individual aesthetic preferences at a given point. This opens a new question of whether AI could serve as a deeper interpreter of human aesthetic sensibility than humans themselves.
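
The comparison between prediction error and within-person variability can be made concrete with a small sketch. It is an illustration only, not the authors' evaluation code: it assumes mean absolute error as the metric (the abstract does not name one), and the function name `mean_absolute_error` and all rating values are hypothetical toy data rather than results from the paper.

```python
from statistics import mean


def mean_absolute_error(a, b):
    """Average absolute difference between two equally long rating lists."""
    if len(a) != len(b):
        raise ValueError("rating lists must have the same length")
    return mean(abs(x - y) for x, y in zip(a, b))


# Hypothetical toy ratings for four images (NOT data from the paper).
first_ratings = [7.0, 4.0, 8.0, 5.0]      # the person's original ratings
re_ratings = [6.0, 5.0, 8.0, 3.0]         # the same person, rated again later
model_predictions = [7.0, 4.5, 7.5, 4.5]  # a model's predicted ratings

# Within-person variability: how far the person drifts from their own ratings.
within_person = mean_absolute_error(first_ratings, re_ratings)

# Model error: how far the predictions are from the original ratings.
model_error = mean_absolute_error(model_predictions, first_ratings)

# The abstract's claim corresponds to this comparison being True.
model_beats_self = model_error < within_person
```

Under this reading, a model "outperforms one's future self" when `model_error` is smaller than `within_person`, that is, when the predictions sit closer to the original ratings than the person's own later re-ratings do.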

Problem

Research questions and friction points this paper is trying to address.

personalized image aesthetics

aesthetic preference prediction

subjective evaluation

individual-dependent aesthetics

image aesthetic assessment

Innovation

Methods, ideas, or system contributions that make the work stand out.

personalized aesthetics

large language models

semantic feature extraction