Balancing Preservation and Modification: A Region and Semantic Aware Metric for Instruction-Based Image Editing

📅 2025-06-15
🤖 AI Summary
Instruction-based image editing lacks an automated evaluation metric that jointly checks modification accuracy and fidelity in unedited regions: existing approaches either rely on costly human assessment or repurpose metrics from unrelated tasks, and so neglect the region-specific nature of instruction-driven edits. To address this, we propose BPM (Balancing Preservation and Modification), a metric tailored to this task that combines region-aware and semantic-aware criteria. First, a SAM- and CLIP-based module localizes the editing-relevant region; the same localization can also be integrated into image editing models to improve editing quality. Second, a multi-granularity vision-language alignment assessment with a dual-branch, interpretable scoring mechanism scores edited (relevant) and unedited (irrelevant) regions separately. On multiple benchmarks, BPM achieves the highest agreement with human judgments among the compared metrics (Spearman ρ = 0.89). The code is publicly available.
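Agreement between a metric and human judgments is commonly summarized with a rank correlation such as Spearman's ρ. The sketch below is a minimal pure-Python version, included only to show what such a number measures. The score lists are toy values invented for illustration and are not taken from the paper, and the function names are hypothetical.

```python
def rank(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(metric_scores, human_scores):
    """Spearman's rho is the Pearson correlation of the two rank vectors."""
    return pearson(rank(metric_scores), rank(human_scores))


# Toy values, invented for illustration only (not results from the paper).
metric_scores = [0.91, 0.40, 0.75, 0.10, 0.66]
human_scores = [5, 2, 3, 1, 4]
rho = spearman(metric_scores, human_scores)
print(round(rho, 3))
```

A value near 1 means the metric orders the edited images almost the same way human raters do, regardless of the absolute scale of either score.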

📝 Abstract
Instruction-based image editing, which aims to modify an image faithfully according to an instruction while leaving irrelevant content unchanged, has made significant progress. However, a comprehensive metric for assessing editing quality is still lacking. Existing metrics either incur high human evaluation costs, which hinder large-scale evaluation, or are adapted from other tasks and lose task-specific concerns. As a result, they fail to evaluate both instruction-based modification and preservation of irrelevant regions, which biases the evaluation. To tackle this, we introduce a new metric called Balancing Preservation and Modification (BPM), tailored for instruction-based image editing, which explicitly disentangles the image into editing-relevant and irrelevant regions and considers each separately. We first identify and locate the editing-relevant regions, then assess editing quality in two tiers: the Region-Aware Judge evaluates whether the position and size of the edited region align with the instruction, and the Semantic-Aware Judge assesses instruction compliance within the editing-relevant regions as well as content preservation within the irrelevant regions, yielding a comprehensive and interpretable quality assessment. Moreover, the editing-relevant region localization in BPM can be integrated into image editing approaches to improve editing quality, demonstrating its broad applicability. We verify the effectiveness of BPM on comprehensive instruction-editing data; the results show the highest alignment with human evaluation among existing metrics. Code is available at: https://joyli-x.github.io/BPM/
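The idea of scoring editing-relevant and irrelevant regions separately can be illustrated with a small sketch. This is not the authors' implementation: it assumes the relevant region is already given as a binary mask, uses tiny grayscale grids in place of real images, and substitutes a plain per-pixel difference for the semantic comparison the abstract describes. The function name `split_region_scores` is hypothetical.

```python
def split_region_scores(original, edited, mask):
    """Score the two regions separately.

    original, edited: 2D lists of grayscale pixel values in [0, 1].
    mask: 2D list of 0/1 flags, where 1 marks an editing-relevant pixel.
    Returns (change_in_region, preservation).
    """
    diff_in, n_in = 0.0, 0
    diff_out, n_out = 0.0, 0
    for row_o, row_e, row_m in zip(original, edited, mask):
        for o, e, m in zip(row_o, row_e, row_m):
            d = abs(o - e)
            if m:
                diff_in += d
                n_in += 1
            else:
                diff_out += d
                n_out += 1
    # Crude proxy: how much the relevant region changed. It says nothing
    # about whether the change actually follows the instruction.
    change_in_region = diff_in / n_in if n_in else 0.0
    # 1.0 means the irrelevant region is untouched.
    preservation = 1.0 - (diff_out / n_out if n_out else 0.0)
    return change_in_region, preservation


original = [[0.2, 0.2], [0.2, 0.2]]
mask = [[1, 0], [0, 0]]            # only the top-left pixel should change

clean_edit = [[0.8, 0.2], [0.2, 0.2]]   # change confined to the mask
leaky_edit = [[0.8, 0.5], [0.2, 0.2]]   # change leaks outside the mask

clean_scores = split_region_scores(original, clean_edit, mask)
leaky_scores = split_region_scores(original, leaky_edit, mask)
print(clean_scores, leaky_scores)
```

Both edits change the relevant pixel by the same amount, but only the second lowers the preservation score, which is the distinction a single whole-image score would blur.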
Problem

Research questions and friction points this paper is trying to address.

No comprehensive metric exists for assessing instruction-based image editing quality
Existing metrics fail to assess modification and preservation simultaneously
BPM is proposed to evaluate both region alignment and semantic alignment
Innovation

Methods, ideas, or system contributions that make the work stand out.

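BPM, a region-aware and semantic-aware metric
Two-tier process for assessing editing quality (Region-Aware Judge, then Semantic-Aware Judge)
Editing-relevant region localization that can be integrated into editing approaches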
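A check on the position and size of an edited region, in the spirit of the Region-Aware Judge, could look like the sketch below. It assumes both the region implied by the instruction and the region actually edited are available as axis-aligned boxes `(x0, y0, x1, y1)`; how BPM derives these regions is not described here, so the box representation, the use of IoU, and all function names are illustrative assumptions.

```python
def box_area(box):
    """Area of an (x0, y0, x1, y1) box; degenerate boxes have area 0."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)


def iou(a, b):
    """Intersection over union: 1.0 for identical boxes, 0.0 if disjoint."""
    inter_box = (max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3]))
    inter = box_area(inter_box)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union else 0.0


def size_ratio(a, b):
    """Smaller area divided by larger area: 1.0 means equal size."""
    small, big = sorted((box_area(a), box_area(b)))
    return small / big if big else 0.0


# Hypothetical boxes for illustration only.
expected_box = (10, 10, 50, 50)   # where the instruction implies the edit
observed_box = (20, 20, 60, 60)   # where the edit actually happened

position_score = iou(expected_box, observed_box)
size_score = size_ratio(expected_box, observed_box)
print(round(position_score, 3), size_score)
```

Keeping the two numbers separate shows which aspect went wrong: here the edit has the right size but is shifted, so only the position score drops.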
Authors
Zhuoying Li (Wangxuan Institute of Computer Technology, Peking University)
Zhu Xu (Peking University)
Yuxin Peng (Wangxuan Institute of Computer Technology, Peking University)
Yang Liu (Wangxuan Institute of Computer Technology, Peking University)