Riemannian-Geometric Fingerprints of Generative Models

📅 2025-06-28

🤖 AI Summary

This work addresses generative model attribution and synthetic content provenance. It proposes a “model fingerprinting” approach grounded in Riemannian geometry. Instead of relying on Euclidean distances, the approach models the latent space of generative models as a non-Euclidean manifold, uses geodesic distance and the Riemannian centroid as the core fingerprint features, and learns the Riemannian metric from data. It also introduces a k-nearest-neighbor Riemannian centroid algorithm so that fingerprint extraction generalizes across architectures, modalities, and datasets. Evaluated on four datasets, 27 generative models, and two modalities (vision and vision-language), the method significantly improves model attribution accuracy and remains robust across the two tested resolutions and in unseen scenarios. The result is an interpretable, generalizable geometric basis for synthetic data forensics and generative model copyright protection.

📝 Abstract

Recent breakthroughs and rapid integration of generative models (GMs) have sparked interest in model attribution and model fingerprints. For instance, service providers need reliable methods of authenticating their models to protect their IP, while users and law enforcement seek to verify the source of generated content for accountability and trust. In addition, the threat of model collapse is growing, as more model-generated data are fed back into sources (e.g., YouTube) that are often harvested for training ("regurgitative training"), heightening the need to differentiate synthetic from human data. Yet a gap remains in understanding generative models' fingerprints, which we believe stems from the lack of a formal framework that can define, represent, and analyze the fingerprints in a principled way. To address this gap, we take a geometric approach and propose a new definition of artifact and fingerprint of GMs using Riemannian geometry, which allows us to leverage the rich theory of differential geometry. Our new definition generalizes previous work (Song et al., 2024) to non-Euclidean manifolds by learning Riemannian metrics from data and replacing the Euclidean distances and nearest-neighbor search with geodesic distances and a kNN-based Riemannian center of mass. We apply our theory to a new gradient-based algorithm for computing the fingerprints in practice. Results show that it is more effective in distinguishing a large array of GMs, spanning 4 datasets at 2 resolutions (64 by 64, 256 by 256), 27 model architectures, and 2 modalities (Vision, Vision-Language). Using our proposed definition significantly improves performance on model attribution, as well as generalization to unseen datasets, model types, and modalities, suggesting its practical efficacy.
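
To make the idea of a "kNN-based Riemannian center of mass" concrete, here is a minimal sketch on a manifold where geodesics have a closed form: the unit sphere. This is an illustration under stated assumptions, not the paper's algorithm. The paper learns its Riemannian metric from data, whereas this sketch swaps in the fixed round metric of the sphere. All function names (`geodesic_dist`, `log_map`, `exp_map`, `knn_riemannian_centroid`, `artifact`) are hypothetical, and treating the artifact as the tangent vector from the centroid to the sample is likewise an assumption made for illustration.

```python
import numpy as np

def geodesic_dist(p, X):
    """Great-circle distance from unit vector p to each row of X."""
    return np.arccos(np.clip(X @ p, -1.0, 1.0))

def log_map(p, X):
    """Tangent vectors at p pointing toward each row of X (sphere log map)."""
    cos = np.clip(X @ p, -1.0, 1.0)
    theta = np.arccos(cos)[:, None]
    perp = X - cos[:, None] * p              # part of each row orthogonal to p
    norm = np.linalg.norm(perp, axis=1, keepdims=True)
    safe = np.where(norm > 1e-12, norm, 1.0)  # avoid division by zero
    return np.where(norm > 1e-12, theta * perp / safe, 0.0)

def exp_map(p, v):
    """Follow the geodesic from p in tangent direction v for length |v|."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return p
    return np.cos(n) * p + np.sin(n) * v / n

def knn_riemannian_centroid(x, real, k=5, steps=100, lr=1.0, tol=1e-10):
    """Karcher (Frechet) mean of the k real points geodesically nearest to x."""
    idx = np.argsort(geodesic_dist(x, real))[:k]
    nbrs = real[idx]
    mu = nbrs[0].copy()                       # initialize at the nearest neighbor
    for _ in range(steps):
        # Descent direction for the mean squared geodesic distance.
        step = log_map(mu, nbrs).mean(axis=0)
        if np.linalg.norm(step) < tol:
            break
        mu = exp_map(mu, lr * step)
        mu = mu / np.linalg.norm(mu)          # guard against numerical drift
    return mu

def artifact(x, real, k=5):
    """ASSUMED definition: tangent vector from the kNN centroid to the sample x."""
    mu = knn_riemannian_centroid(x, real, k)
    return log_map(mu, x[None, :])[0]
```

In this sketch the Euclidean nearest-neighbor projection is replaced by two steps: neighbors are ranked by geodesic distance, and their center of mass is found by gradient descent using the exponential and logarithm maps. With a learned metric, those maps would generally have no closed form and would need to be approximated numerically.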

Problem

Research questions and friction points this paper is trying to address.

Identifying generative models' fingerprints for IP protection.

Differentiating synthetic from human data to prevent model collapse.

Developing a geometric framework to define and analyze model fingerprints.

Innovation

Methods, ideas, or system contributions that make the work stand out.

Riemannian geometry is used to define GM artifacts and fingerprints.

Geodesic distances replace Euclidean distances in fingerprint analysis.

A gradient-based algorithm computes the fingerprints in practice.
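
For reference, the standard differential-geometric objects behind these points can be written as follows. This is textbook notation rather than the paper's own: the symbols $G$ (metric tensor), $\gamma$ (curve), $x_1,\dots,x_k$ (neighbors), $\mu$ (centroid), and $\eta$ (step size) are assumed here for illustration.

```latex
% Length of a curve \gamma : [0,1] \to \mathcal{M} under a Riemannian metric G
L(\gamma) = \int_0^1 \sqrt{\dot\gamma(t)^\top G(\gamma(t))\, \dot\gamma(t)}\; dt

% Geodesic distance: length of the shortest curve joining x and y
d_G(x, y) = \inf_{\gamma(0)=x,\ \gamma(1)=y} L(\gamma)

% Riemannian center of mass (Frechet mean) of x_1, \dots, x_k
\mu = \arg\min_{p \in \mathcal{M}} F(p), \qquad
F(p) = \frac{1}{2k} \sum_{i=1}^{k} d_G(p, x_i)^2

% Riemannian gradient of F and the resulting descent update
\operatorname{grad} F(p) = -\frac{1}{k} \sum_{i=1}^{k} \operatorname{Log}_p(x_i), \qquad
p_{t+1} = \operatorname{Exp}_{p_t}\!\Big( \frac{\eta}{k} \sum_{i=1}^{k} \operatorname{Log}_{p_t}(x_i) \Big)
```

When $G$ is the identity everywhere, $d_G$ reduces to the Euclidean distance and $\mu$ to the arithmetic mean, which is the sense in which a Riemannian definition generalizes a Euclidean one.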
