Smudged Fingerprints: A Systematic Evaluation of the Robustness of AI Image Fingerprints

📅 2025-12-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work systematically evaluates the robustness of AI-generated image fingerprinting techniques against adversarial attacks, focusing on two primary threats: fingerprint removal and fingerprint forgery. We propose the first unified security evaluation framework, incorporating both white-box and black-box threat models and jointly analyzing attacks across the RGB pixel, frequency, and deep-feature domains. The framework implements five distinct attack strategies and benchmarks 14 state-of-the-art fingerprinting methods across 12 advanced generative models. Results reveal a fundamental accuracy-robustness trade-off: no existing method achieves both high detection accuracy and strong adversarial resilience across all threat scenarios. White-box removal attacks achieve success rates above 80%, and black-box variants remain effective (above 50%). Forgery success depends strongly on the targeted model. This study establishes a rigorous benchmark and theoretical foundation for characterizing security boundaries and guiding robustness improvements in image fingerprinting.
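
The summary above reports white-box removal attacks in terms of success rates rather than mechanics. As a point of reference, the following is a minimal PGD-style sketch of how such an attack could be mounted against a differentiable attribution classifier; the `attributor` model, the perturbation budget, and all other names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def removal_attack_pgd(attributor, image, source_label,
                       eps=4/255, alpha=1/255, steps=20):
    """Illustrative white-box fingerprint-removal attack (PGD-style).

    Perturbs `image` within an L-infinity ball of radius `eps` so that the
    attribution classifier no longer assigns it to `source_label`.
    `attributor` is assumed to be a differentiable model mapping images to
    per-generator logits; this is a sketch, not the paper's implementation.
    """
    adv = image.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        logits = attributor(adv)
        # Untargeted objective: maximize the loss w.r.t. the true source model.
        loss = F.cross_entropy(logits, source_label)
        grad, = torch.autograd.grad(loss, adv)
        with torch.no_grad():
            adv = adv + alpha * grad.sign()
            adv = image + (adv - image).clamp(-eps, eps)  # project onto eps-ball
            adv = adv.clamp(0.0, 1.0)                     # keep valid pixel range
        adv = adv.detach()
    return adv
```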

📝 Abstract
Model fingerprint detection techniques have emerged as a promising approach for attributing AI-generated images to their source models, but their robustness under adversarial conditions remains largely unexplored. We present the first systematic security evaluation of these techniques, formalizing threat models that encompass both white- and black-box access and two attack goals: fingerprint removal, which erases identifying traces to evade attribution, and fingerprint forgery, which seeks to cause misattribution to a target model. We implement five attack strategies and evaluate 14 representative fingerprinting methods across the RGB, frequency, and learned-feature domains on 12 state-of-the-art image generators. Our experiments reveal a pronounced gap between clean and adversarial performance. Removal attacks are highly effective, often achieving success rates above 80% in white-box settings and over 50% under constrained black-box access. While forgery is more challenging than removal, its success varies significantly across the targeted models. We also identify a utility-robustness trade-off: methods with the highest attribution accuracy are often vulnerable to attacks. Although some techniques exhibit robustness in specific settings, none achieves high robustness and accuracy across all evaluated threat models. These findings highlight the need for techniques that balance robustness and accuracy, and identify the most promising approaches for advancing this goal.
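
For concreteness, the headline numbers above can be read as simple success-rate metrics over attributor predictions before and after an attack. The sketch below shows one plausible way to compute them; the exact success criteria and function names are assumptions for illustration, not the paper's evaluation code.

```python
from typing import Sequence

def removal_success_rate(pred_before: Sequence[int],
                         pred_after: Sequence[int],
                         source: int) -> float:
    """Fraction of images correctly attributed to `source` before the attack
    that are no longer attributed to it afterwards (illustrative definition)."""
    correctly_attributed = sum(b == source for b in pred_before)
    evaded = sum(b == source and a != source
                 for b, a in zip(pred_before, pred_after))
    return evaded / max(correctly_attributed, 1)

def forgery_success_rate(pred_after: Sequence[int], target: int) -> float:
    """Fraction of attacked images misattributed to the chosen target model
    (illustrative definition)."""
    return sum(p == target for p in pred_after) / max(len(pred_after), 1)
```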
Problem

Research questions and friction points this paper is trying to address.

Evaluates robustness of AI image fingerprint detection under adversarial attacks.
Tests fingerprint removal and forgery attacks across various threat models.
Identifies trade-off between attribution accuracy and attack vulnerability.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic security evaluation of AI image fingerprinting techniques
Implementation of five attack strategies spanning the RGB-pixel, frequency, and deep-feature domains (see the frequency-domain sketch after this list)
Identification of utility-robustness trade-off in attribution methods
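
To make the frequency domain mentioned above concrete: one common family of fingerprinting approaches averages a noise or spectral residual over many images from the same generator. The sketch below follows that general recipe; the median-filter denoiser and the normalization are illustrative assumptions and do not correspond to any specific method benchmarked in the paper.

```python
import numpy as np
from scipy.ndimage import median_filter

def frequency_fingerprint(images: np.ndarray) -> np.ndarray:
    """Estimate a spectral fingerprint from images of a single generator.

    `images` has shape (N, H, W), grayscale in [0, 1]. Each image is denoised
    with a median filter, the residual (image - denoised) is transformed with
    a 2D FFT, and the magnitudes are averaged over the set. Illustrative
    recipe only; real methods differ in denoiser, domain, and normalization.
    """
    spectra = []
    for img in images:
        residual = img - median_filter(img, size=3)
        spectra.append(np.abs(np.fft.fftshift(np.fft.fft2(residual))))
    fingerprint = np.mean(spectra, axis=0)
    return fingerprint / (fingerprint.max() + 1e-8)  # normalize for comparison
```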