UGD-IML: A Unified Generative Diffusion-based Framework for Constrained and Unconstrained Image Manipulation Localization

📅 2025-08-08
📈 Citations: 0
Influential: 0
🤖 AI Summary
Digital image forgery has proliferated in the digital era, yet existing image manipulation localization (IML) and copy-move IML (CIML) methods heavily rely on large-scale, pixel-accurate annotations and suffer from limited training data, complex pipelines, and low efficiency. This paper pioneers the integration of generative diffusion models into image manipulation localization, proposing a unified framework that supports end-to-end, pixel-level localization—both under constrained and unconstrained settings—via class-embedding guidance and parameter sharing, without auxiliary modules or additional training overhead. The model directly learns the distribution of authentic images, enabling high-precision localization alongside intrinsic uncertainty estimation. Evaluated across multiple benchmarks, it achieves average F1-score improvements of +9.66 (IML) and +4.36 (CIML) over state-of-the-art methods, while demonstrating strong robustness and interpretable, visualization-capable localization outputs.

📝 Abstract
In the digital age, advanced image editing tools pose a serious threat to the integrity of visual content, making image forgery detection and localization a key research focus. Most existing Image Manipulation Localization (IML) methods rely on discriminative learning and require large, high-quality annotated datasets. However, current datasets lack sufficient scale and diversity, limiting model performance in real-world scenarios. To overcome this, recent studies have explored Constrained IML (CIML), which generates pixel-level annotations through algorithmic supervision. However, existing CIML approaches often depend on complex multi-stage pipelines, making the annotation process inefficient. In this work, we propose a novel generative framework based on diffusion models, named UGD-IML, which for the first time unifies both IML and CIML tasks within a single framework. By learning the underlying data distribution, generative diffusion models inherently reduce the reliance on large-scale labeled datasets, allowing our approach to perform effectively even under limited data conditions. In addition, by leveraging a class embedding mechanism and a parameter-sharing design, our model seamlessly switches between IML and CIML modes without extra components or training overhead. Furthermore, the end-to-end design enables our model to avoid cumbersome steps in the data annotation process. Extensive experimental results on multiple datasets demonstrate that UGD-IML outperforms state-of-the-art (SOTA) methods by an average of 9.66 and 4.36 F1-score points on the IML and CIML tasks, respectively. Moreover, the proposed method excels in uncertainty estimation, visualization, and robustness.
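The class-embedding and parameter-sharing design described in the abstract can be sketched in miniature: one learned embedding per task mode is added to the conditioning signal, while a single set of network weights serves both IML and CIML. Everything below (dimensions, the toy `denoise_step`, the linear "denoiser") is a hypothetical illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper does not specify these.
EMBED_DIM = 8
NUM_MODES = 2  # 0 = IML, 1 = CIML

# One learned embedding per task mode (the class-embedding mechanism).
mode_embeddings = rng.normal(size=(NUM_MODES, EMBED_DIM))

# A single shared weight matrix stands in for the shared denoiser
# (parameter sharing: no separate branch per task).
shared_weights = rng.normal(size=(EMBED_DIM, EMBED_DIM))

def denoise_step(x, t_embed, mode):
    """One toy denoising step conditioned on the task mode.

    The mode embedding is summed with the timestep embedding, so the
    same shared weights handle both IML and CIML without extra modules.
    """
    cond = t_embed + mode_embeddings[mode]
    return x + cond @ shared_weights  # placeholder for a real U-Net step

x = rng.normal(size=EMBED_DIM)       # toy noisy latent
t_embed = rng.normal(size=EMBED_DIM)  # toy timestep embedding

out_iml = denoise_step(x, t_embed, mode=0)
out_ciml = denoise_step(x, t_embed, mode=1)
```

Switching tasks is then just a change of the `mode` index: no retraining and no added parameters beyond the two embedding vectors.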
Problem

Research questions and friction points this paper is trying to address.

How to detect and localize image manipulations with generative diffusion models
How to unify constrained and unconstrained image manipulation localization in a single framework
How to reduce reliance on large pixel-annotated datasets through generative learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Unifies IML and CIML with diffusion models
Reduces reliance on large labeled datasets
Enables end-to-end annotation without overhead
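Both the summary and the abstract credit the generative design with "intrinsic uncertainty estimation": because reverse diffusion is stochastic, repeated samples can disagree, and per-pixel disagreement serves as an uncertainty map. A minimal sketch of that aggregation idea follows; `sample_mask` is a hypothetical stand-in for one reverse-diffusion pass, not the paper's trained sampler.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_mask(evidence, noise_seed):
    """Toy stand-in for one stochastic reverse-diffusion pass that
    returns a soft localization mask in [0, 1]. Hypothetical only."""
    local_rng = np.random.default_rng(noise_seed)
    noise = local_rng.normal(scale=0.05, size=evidence.shape)
    return np.clip(evidence + noise, 0.0, 1.0)

evidence = rng.uniform(size=(4, 4))  # placeholder per-pixel forgery evidence

# Draw several masks under different noise seeds, then aggregate:
# the mean is the localization prediction, the spread is the uncertainty.
masks = np.stack([sample_mask(evidence, seed) for seed in range(8)])
mean_mask = masks.mean(axis=0)    # final soft localization map
uncertainty = masks.std(axis=0)   # per-pixel uncertainty estimate
```

Pixels where the samples agree get near-zero uncertainty; ambiguous regions show higher spread, which is what makes the outputs visualizable and interpretable.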