Generalizable Origin Identification for Text-Guided Image-to-Image Diffusion Models

📅 2025-01-04
🤖 AI Summary
The misuse of text-guided image-to-image diffusion models (e.g., for misinformation, copyright infringement, and evading content tracing) poses critical challenges for provenance attribution. Method: This paper introduces ID² (origin IDentification for text-guided Image-to-image Diffusion models), a task that retrieves the original image behind a given translated query and requires generalization across diffusion models. We construct OriPID, the first dataset for this task, containing origins and guided prompts for training and testing across multiple diffusion models. Working in the embedding space of a pre-trained variational autoencoder (VAE), we prove that there exists a linear transformation that minimizes the distance between the embeddings of generated images and their origins, and we show that this transformation generalizes across different diffusion models. Origins are then retrieved by similarity search over the transformed embeddings. Results: Under cross-model evaluation (training on images from one diffusion model and testing on those from another), the method surpasses similarity-based methods by 31.6% mean Average Precision (mAP), including methods designed for generalization.
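The summary reports its result in mean Average Precision (mAP). As a reminder of what that metric measures in a retrieval setting, here is a minimal sketch of the generic definition in plain Python. It is not the paper's evaluation code; the function names and the toy rankings are illustrative assumptions.

```python
def average_precision(ranked_ids, relevant_ids):
    """AP for one query: mean of precision@k taken at each rank k that holds a relevant item."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, ref_id in enumerate(ranked_ids, start=1):
        if ref_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(rankings, ground_truth):
    """mAP: average of the per-query AP values."""
    aps = [average_precision(r, g) for r, g in zip(rankings, ground_truth)]
    return sum(aps) / len(aps)


# Toy example (hypothetical ids): two queries whose true origin is "a".
rankings = [["a", "b", "c"], ["b", "a", "c"]]
ground_truth = [["a"], ["a"]]
score = mean_average_precision(rankings, ground_truth)
print(score)
```

When every query has exactly one correct origin, as in the toy example, the AP of a query reduces to the reciprocal of the rank at which that origin is returned.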

📝 Abstract
Text-guided image-to-image diffusion models excel at translating images based on textual prompts, allowing for precise and creative visual modifications. However, such a powerful technique can be misused for spreading misinformation, infringing on copyrights, and evading content tracing. This motivates us to introduce the task of origin IDentification for text-guided Image-to-image Diffusion models (ID$^2$), which aims to retrieve the original image of a given translated query. A straightforward solution to ID$^2$ involves training a specialized deep embedding model to extract and compare features from both query and reference images. However, due to the visual discrepancy across generations produced by different diffusion models, this similarity-based approach fails when trained on images from one model and tested on those from another, limiting its effectiveness in real-world applications. To address this challenge, we contribute the first dataset and a theoretically guaranteed method for the ID$^2$ task, both emphasizing generalizability. The curated dataset, OriPID, contains abundant Origins and guided Prompts, which can be used to train and test potential IDentification models across various diffusion models. For the method, we first prove the existence of a linear transformation that minimizes the distance between the pre-trained Variational Autoencoder (VAE) embeddings of generated samples and their origins. We then demonstrate that such a simple linear transformation generalizes across different diffusion models. Experimental results show that the proposed method achieves satisfactory generalization performance, significantly surpassing similarity-based methods ($+31.6\%$ mAP), even those with generalization designs.
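To make the idea of a distance-minimizing linear transformation concrete, here is a minimal NumPy sketch. It rests on several assumptions that are not taken from the paper: random vectors stand in for VAE embeddings, the map is fit by ridge-regularized least squares and applied to the generated-image side only, and retrieval uses cosine similarity. The paper's actual objective and training procedure may differ, and all names below are hypothetical.

```python
import numpy as np


def fit_linear_map(gen_emb, orig_emb, ridge=1e-3):
    """Closed-form W minimizing ||gen_emb @ W - orig_emb||^2 + ridge * ||W||^2."""
    d = gen_emb.shape[1]
    gram = gen_emb.T @ gen_emb + ridge * np.eye(d)
    return np.linalg.solve(gram, gen_emb.T @ orig_emb)


def l2_normalize(x):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def retrieve_origins(query_emb, ref_emb, W):
    """For each query, return the index of the most similar reference after mapping."""
    sims = l2_normalize(query_emb @ W) @ l2_normalize(ref_emb).T
    return np.argmax(sims, axis=1)


# Synthetic stand-in data (assumption): "origins" are random vectors and
# "generated" embeddings are a noisy linear distortion of them.
rng = np.random.default_rng(0)
n, d = 200, 16
orig = rng.normal(size=(n, d))
distortion = np.eye(d) + 0.1 * rng.normal(size=(d, d))
gen = orig @ distortion + 0.05 * rng.normal(size=(n, d))

# Fit on the first 150 pairs, then retrieve origins for the 50 held-out queries
# against the full reference set of 200 origins.
W = fit_linear_map(gen[:150], orig[:150])
pred = retrieve_origins(gen[150:], orig, W)
accuracy = float(np.mean(pred == np.arange(150, n)))
print(f"held-out top-1 accuracy: {accuracy:.2f}")
```

This toy setup only illustrates fitting and applying a linear map for retrieval. It does not reproduce the cross-model setting, where the map is learned on images from one diffusion model and evaluated on images from another.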
Problem

Research questions and friction points this paper is trying to address.

Image Attribution
Forgery Detection
Source Identification
Innovation

Methods, ideas, or system contributions that make the work stand out.

Image Origin Recognition
Cross-model Identification
Simple Linear Transformation
🔎 Similar Papers
2024-03-28, IEEE Workshop/Winter Conference on Applications of Computer Vision, Citations: 1
Wenhao Wang
University of Technology Sydney
Yifan Sun
Baidu Inc.
Zongxin Yang
Zhejiang University
Zhentao Tan
Baidu Inc.
Zhengdong Hu
Zhejiang University
computer vision
Yi Yang
Zhejiang University