ServImage: An Image Generation and Editing Benchmark from Real-world Commercial Imaging Services

📅 2026-04-27

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the unclear commercial value of current image generation and editing models in real-world design scenarios by proposing a framework that directly links image quality assessment to human payment decisions. The authors introduce ServImageBench, a dataset of 1.07k paid commercial design tasks and 2.05k designer deliverables, along with ServImageScore, a scoring system that combines three dimensions: baseline requirements fulfilment, visual execution quality, and commercial necessity satisfaction. Using 33k human-annotated candidate images, they train ServImageModel, a payment prediction model that achieves 82.00% accuracy in predicting whether a human would pay for a given image. The model also outputs calibrated payment probabilities, which serve as a quantitative measure of an image's commercial viability.

📝 Abstract

Recent image generation and editing models demonstrate robust adherence to instructions and high visual quality on academic benchmarks. However, their performance on paid, real-world design projects remains uncertain. We introduce ServImage, a benchmark that explicitly correlates model outputs with economic value in commercial design projects. ServImage consists of (i) ServImageBench: a dataset of 1.07k paid commercial design tasks and 2.05k designer deliverables totaling over $295k, covering portrait, product, and digital content, along with 33k candidate images and 33k human annotations. (ii) ServImageScore: an integrated scoring system that combines three quality dimensions: baseline requirements fulfilment, visual execution quality, and commercial necessity satisfaction. These three dimensions are designed to characterize the factors that drive human payment decisions and indicate whether an image is commercially acceptable. (iii) ServImageModel: under this scoring system, we propose a payment prediction model trained on the human-annotated candidate images, achieving 82.00% accuracy in predicting human payment decisions and producing calibrated payment probabilities. ServImage provides a comprehensive foundation for assessing the commercial viability of image generation models and offers a scalable resource for future research on economically grounded vision systems. Code is available on GitHub: https://github.com/FengxianJi/ServImage
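
The abstract reports two properties of the payment predictor: accuracy against human payment decisions and calibrated payment probabilities. As a rough illustration of what those two quantities mean, the sketch below computes thresholded accuracy and expected calibration error (ECE) from a list of predicted probabilities and binary pay/no-pay labels. This is not the authors' evaluation code. The function names, the 0.5 threshold, the equal-width binning, and the choice of ECE as the calibration measure are all assumptions made for this example.

```python
from typing import Sequence


def _validate(probs: Sequence[float], paid: Sequence[int]) -> None:
    # Shared input checks for both metrics (hypothetical helper).
    if len(probs) != len(paid) or len(probs) == 0:
        raise ValueError("probs and paid must be non-empty and the same length")
    if any(p < 0.0 or p > 1.0 for p in probs):
        raise ValueError("probabilities must lie in [0, 1]")


def payment_accuracy(
    probs: Sequence[float], paid: Sequence[int], threshold: float = 0.5
) -> float:
    """Fraction of images where the thresholded prediction matches the human decision."""
    _validate(probs, paid)
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, paid))
    return hits / len(probs)


def expected_calibration_error(
    probs: Sequence[float], paid: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between predicted probability and observed pay rate per bin."""
    _validate(probs, paid)
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, paid):
        # Equal-width bins over [0, 1]; p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    total = len(probs)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        mean_prob = sum(p for p, _ in bucket) / len(bucket)
        pay_rate = sum(y for _, y in bucket) / len(bucket)
        ece += (len(bucket) / total) * abs(mean_prob - pay_rate)
    return ece
```

A lower ECE means the predicted probabilities track observed pay rates more closely. For example, among images scored near 0.8, roughly 80% should be ones that annotators were willing to pay for.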

Problem

Research questions and friction points this paper is trying to address.

image generation

commercial benchmark

economic value

visual quality

human payment decisions

Innovation

Methods, ideas, or system contributions that make the work stand out.

commercial benchmark

image generation

economic value