FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition

📅 2024-09-13

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

To address key bottlenecks in large language model (LLM) intellectual property protection, including high fingerprint migration overhead, post-embedding instability, and interference with downstream adaptation, this paper proposes a lightweight, scalable vector-based fingerprint embedding method. The approach introduces a reusable fingerprint vector mechanism: a single fingerprint vector, generated once, is injected into any derivative LLM’s weights via CPU-efficient additive perturbation, eliminating the need to fine-tune each model separately. The method is reported to ensure both provable unremovability and zero functional degradation, with a <0.3% inference latency increase and fully preserved task performance. Evaluation across multiple LLMs is reported to show >99.5% fingerprint identification accuracy and negligible memory overhead. It is presented as the first method enabling *proactive*, *stable*, and *non-intrusive* large-scale fingerprint deployment for LLMs.

📝 Abstract

Training Large Language Models (LLMs) requires immense computational power and vast amounts of data. As a result, protecting the intellectual property of these models through fingerprinting is essential for ownership authentication. While adding fingerprints to LLMs through fine-tuning has been attempted, it remains costly and unscalable. In this paper, we introduce FP-VEC, a pilot study on using fingerprint vectors as an efficient fingerprinting method for LLMs. Our approach generates a fingerprint vector that represents a confidential signature embedded in the model, allowing the same fingerprint to be seamlessly incorporated into an unlimited number of LLMs via vector addition. Results on several LLMs show that FP-VEC is lightweight by running on CPU-only devices for fingerprinting, scalable with a single training and unlimited fingerprinting process, and preserves the model's normal behavior. The project page is available at https://fingerprintvector.github.io .

Problem

Research questions and friction points this paper is trying to address.

Reducing computational overhead in fingerprinting multiple downstream models

Addressing fingerprint instability during model fine-tuning

Enabling scalable post-hoc fingerprint transfer via vector addition

Innovation

Methods, ideas, or system contributions that make the work stand out.

Extracts fingerprint vector from base model

Transfers fingerprint via parameter addition

Enables scalable post-hoc fingerprint transfer
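
The extract-then-add idea listed above can be illustrated with a minimal sketch. It assumes the fingerprint vector is the element-wise difference between a fingerprinted copy of the base model and the base model itself, in the style of task arithmetic; the summary does not spell out the exact formula. The function names, the `scale` parameter, and the toy weight dictionaries are hypothetical stand-ins for real model state dicts, not the authors' code.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def extract_fingerprint_vector(base: Weights, fingerprinted: Weights) -> Weights:
    """Assumed definition: fingerprinted weights minus base weights, per parameter."""
    return {
        name: [f - b for f, b in zip(fingerprinted[name], base[name])]
        for name in base
    }


def add_fingerprint(model: Weights, vector: Weights, scale: float = 1.0) -> Weights:
    """Return a new weight dict with the (optionally scaled) vector added."""
    return {
        name: [w + scale * v for w, v in zip(model[name], vector[name])]
        for name in model
    }


# Hypothetical weights: the base model, a copy fine-tuned once to carry the
# fingerprint, and a downstream model derived from the same base.
base = {"layer.weight": [0.5, -1.0, 2.0]}
fingerprinted = {"layer.weight": [0.75, -1.0, 1.5]}
downstream = {"layer.weight": [1.0, 0.0, 2.5]}

vector = extract_fingerprint_vector(base, fingerprinted)  # computed once
stamped = add_fingerprint(downstream, vector)             # reused per model
```

Because stamping is only an element-wise addition over the weights, it needs no gradient computation, which is consistent with the abstract's claim that fingerprinting runs on CPU-only devices. The same `vector` can be added to any number of downstream models that share the base model's parameter shapes.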

🔎 Similar Papers

A Fingerprint for Large Language Models