🤖 AI Summary
To address risks of large language model (LLM) theft and misuse, this paper proposes a verifiable, tamper-resistant model fingerprinting technique. Methodologically, it constructs a cryptographic hash chain using question-answer pairs and SHA-256, enforcing fine-grained response constraints and hash binding to ensure strong integrity verification. It is the first work to formally define and satisfy five core fingerprint properties: transparency, efficiency, persistence, robustness, and unforgeability. Extensive experiments across multiple LLMs demonstrate that the fingerprint withstands benign modifications—including fine-tuning and pruning—as well as adversarial erasure attacks, while preserving near-original model performance post-embedding. This work delivers the first complete solution for LLM copyright protection and provenance tracking that simultaneously achieves theoretical rigor and engineering practicality.
📝 Abstract
Amid growing concerns over the ease of theft and misuse of Large Language Models (LLMs), the need for fingerprinting models has increased. Fingerprinting, in this context, means that the model owner can link a given model to their original version, thereby identifying if their model is being misused or has been completely stolen. In this paper, we first define a set of five properties a successful fingerprint should satisfy; namely, the fingerprint should be Transparent, Efficient, Persistent, Robust, and Unforgeable. Next, we propose Chain&Hash, a new, simple fingerprinting approach that implements a fingerprint with a cryptographic flavor, achieving all these properties. Chain&Hash involves generating a set of questions (the fingerprints) along with a set of potential answers. These elements are hashed together using a secure hashing technique to select the answer for each question, thereby providing an unforgeability property that prevents adversaries from claiming false ownership. We evaluate the Chain&Hash technique on multiple models and demonstrate its robustness against benign transformations, such as fine-tuning on different datasets, and adversarial attempts to erase the fingerprint. Finally, our experiments demonstrate the efficiency of implementing Chain&Hash and its utility: fingerprinted models achieve almost the same performance as non-fingerprinted ones across different benchmarks.
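The abstract describes the core mechanism: fingerprint questions and candidate answers are hashed together, and the hash deterministically selects each question's target answer, so the question-to-answer mapping cannot be fabricated after the fact. The following is a minimal sketch of such a hash-based selection step, assuming SHA-256 as the secure hash and a simple concatenation of all questions and candidate answers as the binding input; the exact serialization and chaining details in the paper may differ.

```python
import hashlib

def select_fingerprint_answers(questions, candidate_answers):
    """Sketch of a Chain&Hash-style answer selection.

    All questions and candidate answers are bound together into one
    anchor string; each question's answer index is then derived from
    SHA-256 over (anchor + question). Because the mapping depends on
    the full set, an adversary cannot retroactively craft a set of
    questions/answers that hashes to a chosen mapping.

    NOTE: the concatenation format here is an illustrative assumption,
    not the paper's exact construction.
    """
    anchor = "|".join(questions) + "||" + "|".join(candidate_answers)
    fingerprint = {}
    for q in questions:
        digest = hashlib.sha256((anchor + "|" + q).encode("utf-8")).digest()
        idx = int.from_bytes(digest, "big") % len(candidate_answers)
        fingerprint[q] = candidate_answers[idx]
    return fingerprint

# Example: the same inputs always yield the same mapping (verifiable),
# while changing any question or answer changes the whole mapping.
qs = ["What is the secret phrase alpha?", "What is the secret phrase beta?"]
answers = ["emerald", "quartz", "obsidian", "topaz"]
mapping = select_fingerprint_answers(qs, answers)
```

The key design point sketched here is that the hash input includes the entire question and answer sets, not just the individual question, which is what ties unforgeability to the whole fingerprint rather than to any single pair.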