AI Summary
Existing art image evaluation methods focus solely on static outputs, neglecting the dynamic, stage-wise nature of the painting process.
Method: This paper introduces the first systematic evaluation framework for painting processes. It proposes PPAD, a dataset of authentic and synthetic painting-process sequences annotated by experts along eight dimensions, and PPJudge, a temporally aware assessment model built on a Transformer architecture. PPJudge incorporates learnable temporal positional encodings and a heterogeneous Mixture-of-Experts (MoE) module to jointly model image sequences and learn multiple artistic attributes.
Contribution/Results: Experiments demonstrate that PPJudge significantly outperforms state-of-the-art methods in evaluation accuracy, robustness, and human alignment. It serves as the first interpretable, high-fidelity assessment tool explicitly designed for painting processes, advancing computational creativity research and art education.
Abstract
Artistic image assessment has become a prominent research area in computer vision. In recent years, the field has witnessed a proliferation of datasets and methods designed to evaluate the aesthetic quality of paintings. However, most existing approaches focus solely on static final images, overlooking the dynamic and multi-stage nature of the artistic painting process. To address this gap, we propose a novel framework for human-aligned assessment of painting processes. Specifically, we introduce the Painting Process Assessment Dataset (PPAD), the first large-scale dataset comprising real and synthetic painting process images, annotated by domain experts across eight detailed attributes. Furthermore, we present PPJudge (Painting Process Judge), a Transformer-based model enhanced with temporally-aware positional encoding and a heterogeneous mixture-of-experts architecture, enabling effective assessment of the painting process. Experimental results demonstrate that our method outperforms existing baselines in accuracy, robustness, and alignment with human judgment, offering new insights into computational creativity and art education.
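To make the architectural description more concrete, the following is a minimal NumPy sketch of how a learnable temporal positional encoding and a heterogeneous mixture-of-experts layer could be combined to score a sequence of painting stages. It is not the PPJudge implementation. The class name `ProcessScorerSketch`, all dimensions, the single attention layer, the dense softmax gate, and the reading of "heterogeneous" as "experts with different hidden widths" are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ProcessScorerSketch:
    """Hypothetical forward pass only; not the authors' PPJudge model."""

    def __init__(self, dim=32, max_steps=16, expert_widths=(16, 32, 64), n_attrs=8):
        s = 0.1  # initialisation scale (assumed)
        # One trainable vector per painting stage: the temporal positional encoding.
        self.pos = rng.normal(0, s, (max_steps, dim))
        # Single-head self-attention projections.
        self.wq, self.wk, self.wv = (rng.normal(0, s, (dim, dim)) for _ in range(3))
        # Experts with different hidden widths (assumed meaning of "heterogeneous").
        self.experts = [
            (rng.normal(0, s, (dim, w)), rng.normal(0, s, (w, dim)))
            for w in expert_widths
        ]
        # Gate producing one weight per expert for every stage.
        self.gate = rng.normal(0, s, (dim, len(expert_widths)))
        # One linear output per assessed attribute (eight in PPAD).
        self.head = rng.normal(0, s, (dim, n_attrs))

    def forward(self, frames):
        # frames: (T, dim) features of T painting stages, assumed to come
        # from some image encoder that is outside this sketch.
        t, dim = frames.shape
        x = frames + self.pos[:t]                      # inject stage order
        q, k, v = x @ self.wq, x @ self.wk, x @ self.wv
        attn = softmax(q @ k.T / np.sqrt(dim))         # (T, T) stage-to-stage weights
        x = x + attn @ v                               # residual attention update
        weights = softmax(x @ self.gate)               # (T, n_experts)
        expert_out = np.stack(
            [np.maximum(x @ w1, 0) @ w2 for w1, w2 in self.experts], axis=1
        )                                              # (T, n_experts, dim)
        x = x + (weights[..., None] * expert_out).sum(axis=1)
        return x.mean(axis=0) @ self.head              # (n_attrs,) scores


frames = rng.normal(size=(5, 32))                      # five stages of one painting
scores = ProcessScorerSketch().forward(frames)
print(scores.shape)  # (8,)
```

Without the positional term, this particular sketch would be invariant to the order of the stages, because attention, per-stage experts, and mean pooling all treat the sequence as a set. Adding `self.pos` is what lets the score depend on when each stage occurs.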