AuthPrint: Fingerprinting Generative Models Against Malicious Model Providers

📅 2025-08-06
🤖 AI Summary
To address the challenge of unverifiable provenance of generative model outputs in high-stakes scenarios, where model providers may actively evade attribution, this paper proposes a robust fingerprinting-based authentication method designed specifically for adversarial providers. It is the first work to extend model fingerprinting to such an adversarial threat model: a trusted verifier extracts secret fingerprints from the model's output space, unknown to the provider, and trains a model to predict and verify them, with dedicated detection mechanisms tailored to the distinct statistical characteristics of GANs and diffusion models. The method demonstrates strong robustness against architectural modifications, data-based fine-tuning, and output perturbations, achieving a near-zero false positive rate at 95% true positive rate (FPR@95%TPR), and it withstands active evasion attacks that attempt to remove or obfuscate the fingerprint. The implementation is publicly available.

📝 Abstract
Generative models are increasingly adopted in high-stakes domains, yet current deployments offer no mechanisms to verify the origin of model outputs. We address this gap by extending model fingerprinting techniques beyond the traditional collaborative setting to one where the model provider may act adversarially. To our knowledge, this is the first work to evaluate fingerprinting for provenance attribution under such a threat model. The methods rely on a trusted verifier that extracts secret fingerprints from the model's output space, unknown to the provider, and trains a model to predict and verify them. Our empirical evaluation shows that our methods achieve near-zero FPR@95%TPR for instances of GAN and diffusion models, even when tested on small modifications to the original architecture and training data. Moreover, the methods remain robust against adversarial attacks that actively modify the outputs to bypass detection. Source code is available at https://github.com/PSMLab/authprint.
Problem

Research questions and friction points this paper is trying to address.

Verifying generative model outputs' origin against adversarial providers
Extending fingerprinting techniques to malicious model provider scenarios
Ensuring robust provenance attribution under active evasion attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts secret fingerprints from model outputs
Trains verifier model to predict fingerprints
Robust against adversarial output modifications
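The verification loop described above can be illustrated with a toy sketch: the verifier secretly selects fingerprint coordinates in the output space, trains a predictor for those coordinates from the remaining ones on genuine outputs, and accepts an output batch only if the prediction error stays below a calibrated threshold. Everything here is an illustrative assumption, not the paper's implementation: the linear "generators", ridge-regression predictor, coordinate-based fingerprint, and all names (`secret`, `fingerprint_error`, `verify`, the 3x threshold rule) are hypothetical stand-ins for the paper's trained predictors over GAN/diffusion outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
D, K, LATENT = 64, 8, 6  # output dim, fingerprint size, latent dim (toy values)

# The verifier's secret: which output coordinates form the fingerprint.
secret = rng.choice(D, size=K, replace=False)
rest = np.setdiff1d(np.arange(D), secret)

A = rng.normal(size=(D, LATENT))  # stand-in for the authentic generator
B = rng.normal(size=(D, LATENT))  # stand-in for an unrelated impostor model

def sample(mix, n):
    """Draw n flattened outputs x = mix @ z + small noise, shape (n, D)."""
    z = rng.normal(size=(n, LATENT))
    return z @ mix.T + 0.01 * rng.normal(size=(n, D))

# Train the verifier's predictor: ridge regression mapping the non-secret
# coordinates of genuine outputs to the secret fingerprint coordinates.
X = sample(A, 2000)
G = X[:, rest]
reg = 1e-3 * np.eye(len(rest))
W = np.linalg.solve(G.T @ G + reg, G.T @ X[:, secret])

def fingerprint_error(x):
    """Mean squared error between predicted and observed secret coordinates."""
    return float(np.mean((x[:, rest] @ W - x[:, secret]) ** 2))

# Calibrate an acceptance threshold on held-out genuine outputs
# (the 3x margin is an arbitrary choice for this sketch).
tau = 3.0 * fingerprint_error(sample(A, 500))

def verify(x):
    """Attribute a batch of outputs to the authentic model iff error < tau."""
    return fingerprint_error(x) < tau
```

Because genuine outputs share a structure the predictor has learned, their fingerprint coordinates are predictable and `verify` accepts them, while outputs from a different model yield large prediction error and are rejected; the provider cannot easily evade this without knowing which coordinates are secret.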