Evaluating Dataset Watermarking for Fine-tuning Traceability of Customized Diffusion Models: A Comprehensive Benchmark and Removal Approach

📅 2025-11-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of training-data provenance in diffusion model fine-tuning and the absence of standardized evaluation criteria for watermarking techniques, this paper introduces the first comprehensive evaluation framework for fine-tuning traceability. The authors propose a unified threat model to systematically assess existing watermarking methods across three critical dimensions: universality, transmissibility, and robustness, with particular attention to realistic adversarial attacks. They further design a black-box watermark removal algorithm that operates without access to the original training data, enabling complete watermark erasure while preserving fine-tuned model performance. Experimental results reveal that current watermarking methods hold up under conventional benchmarks but are consistently vulnerable in practical threat scenarios. This work establishes a reproducible benchmark, advocates a more realistic evaluation paradigm, and delivers critical security insights, thereby advancing the development of trustworthy generative models.

📝 Abstract
Recent fine-tuning techniques for diffusion models enable them to reproduce specific image sets, such as particular faces or artistic styles, but also introduce copyright and security risks. Dataset watermarking has been proposed to ensure traceability by embedding imperceptible watermarks into training images, which remain detectable in generated outputs even after fine-tuning. However, current methods lack a unified evaluation framework. To address this gap, the paper establishes a general threat model and introduces a comprehensive evaluation framework encompassing Universality, Transmissibility, and Robustness. Experiments show that existing methods perform well in universality and transmissibility, and exhibit some robustness against common image processing operations, yet still fall short under real-world threat scenarios. To expose these vulnerabilities, the paper further proposes a practical watermark removal method that fully eliminates dataset watermarks without degrading fine-tuning, highlighting a key challenge for future research.
Problem

Research questions and friction points this paper is trying to address.

Evaluating dataset watermarking effectiveness for tracing customized diffusion models
Establishing unified benchmark framework for watermark universality and robustness
Developing removal method exposing vulnerabilities in current watermarking techniques
Innovation

Methods, ideas, or system contributions that make the work stand out.

Establishes comprehensive evaluation framework for dataset watermarking
Proposes practical watermark removal method without affecting fine-tuning
Benchmarks watermarking methods across universality, transmissibility, and robustness
Xincheng Wang
Donghua University
Hanchi Sun
Shanghai Jiao Tong University
Wenjun Sun
Xidian University
Kejun Xue
Donghua University
Wangqiu Zhou
Hefei University of Technology
Jianbo Zhang
Shanghai Jiao Tong University
Wei Sun
East China Normal University
Dandan Zhu
East China Normal University
Xiongkuo Min
Shanghai Jiao Tong University
Jun Jia
Shanghai Jiao Tong University
Zhijun Fang
Donghua University, Shanghai University of Engineering Science
Computer Vision · Data Analysis · Pattern Recognition · Multimedia Technology