AI Summary
Existing image editing quality assessment methods rely on handcrafted heuristic prompts, which struggle to generalize across diverse editing effects and fail to account for the continuity of the scoring space. To address these limitations, this work proposes DS-IEQA, a unified framework that adaptively learns evaluation criteria through Feedback-Driven Metric Prompt Optimization (FDMPO) and models the continuous structure of quality scores via a Token-Decoupled Distance Regression Loss (TDRL). Built on a multimodal large language model, the method achieves competitive performance without requiring additional training data, placing fourth in Track 2 of the NTIRE 2026 X-AIGC Quality Assessment Challenge. This result indicates the framework's effectiveness and generalization capability.
Abstract
Recent advances in image editing have heightened the need for reliable Image Editing Quality Assessment (IEQA). Unlike traditional quality assessment, IEQA requires complex reasoning over multimodal inputs and multi-dimensional assessments. Existing MLLM-based approaches often rely on human heuristic prompting, leading to two key limitations: rigid metric prompting and distance-agnostic score modeling. These limitations hinder alignment with implicit human criteria and fail to capture the continuous structure of score spaces. To address them, we propose Define-and-Score Image Editing Quality Assessment (DS-IEQA), a unified framework that jointly learns evaluation criteria and score representations. Specifically, we introduce Feedback-Driven Metric Prompt Optimization (FDMPO) to automatically refine metric definitions via probabilistic feedback. Furthermore, we propose Token-Decoupled Distance Regression Loss (TDRL), which decouples numerical tokens from language modeling to explicitly model score continuity through expected distance minimization. Extensive experiments show our method's superior performance; it ranks 4th in the 2026 NTIRE X-AIGC Quality Assessment Track 2 without any additional training data.
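To make the contrast between distance-agnostic score modeling and expected distance minimization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a five-point score scale, an absolute-difference distance, and a softmax restricted to the logits of the score tokens; the function names and the example logits are hypothetical. The exact form of TDRL is not specified in the abstract.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(score_logits, score_values, target):
    """Distance-agnostic baseline: only the probability of the exact target token matters."""
    probs = softmax(score_logits)
    return -math.log(probs[score_values.index(target)])

def expected_distance_loss(score_logits, score_values, target):
    """Assumed form of an expected-distance objective: sum_k p_k * |s_k - y|."""
    probs = softmax(score_logits)
    return sum(p * abs(v - target) for p, v in zip(probs, score_values))

# Hypothetical logits over the score tokens "1".."5" (assumed scale).
score_values = [1.0, 2.0, 3.0, 4.0, 5.0]
target = 5.0
near_miss = [0.0, 0.0, 0.0, 5.0, 0.0]  # most probability mass on score 4
far_miss = [5.0, 0.0, 0.0, 0.0, 0.0]   # most probability mass on score 1

ce_near = cross_entropy(near_miss, score_values, target)
ce_far = cross_entropy(far_miss, score_values, target)
ed_near = expected_distance_loss(near_miss, score_values, target)
ed_far = expected_distance_loss(far_miss, score_values, target)
```

In this toy setting the cross-entropy of the two predictions is the same, because the target token receives the same probability in both, whereas the expected distance is smaller for the prediction concentrated on 4 than for the one concentrated on 1. This is the sense in which a distance-aware objective can reflect the continuous structure of the score space.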