🤖 AI Summary
Large language models (LLMs) face risks of unauthorized redistribution and commercial exploitation, and existing watermarking methods struggle to simultaneously achieve concealment, robustness, and functional preservation: intrinsic methods require full-parameter access, while backdoor-based approaches rely on statistically anomalous triggers that are easily detected. This paper proposes FPEdit, a lightweight fingerprint embedding framework based on localized knowledge editing. FPEdit embeds ownership identifiers via sparse weight modifications, using semantically natural fingerprint text so that neither anomalous triggers nor full-parameter access is required. It is the first method to demonstrate comprehensive robustness across 24 diverse attack settings, including full-parameter fine-tuning, quantization, pruning, and stochastic decoding. FPEdit embeds 10 distinct fingerprints into LLaMA2-7B within 10 minutes using less than 32 GB of GPU memory, achieves 95–100% fingerprint retention with no degradation across 24 downstream tasks, and reduces computational overhead by roughly 70% compared to prior approaches.
📝 Abstract
Large language models represent significant investments in computation, data, and engineering expertise, making them extraordinarily valuable intellectual assets. Nevertheless, these AI assets remain vulnerable to unauthorized redistribution and commercial exploitation through fine-tuning or black-box deployment. Current fingerprinting approaches face a fundamental trade-off: intrinsic methods require full parameter access, while backdoor-based techniques employ statistically anomalous triggers easily detected and filtered by adversaries. To address these limitations, we introduce FPEdit, a novel knowledge-editing framework that injects semantically coherent natural language fingerprints by modifying a sparse subset of model weights. This ensures stealthy and precise ownership encoding without degrading the core functionality. Extensive experiments show that FPEdit achieves 95–100% fingerprint retention under both full-parameter fine-tuning and parameter-efficient adaptation, while preserving performance on 24 downstream benchmarks. Moreover, FPEdit remains robust under quantization, pruning, and stochastic decoding, and can embed 10 fingerprint pairs into LLaMA2-7B in under 10 minutes using less than 32 GB of GPU memory, a 70% reduction in resource requirements compared to existing techniques. These advances establish FPEdit as the first fingerprinting approach to simultaneously achieve robustness against adaptation, resistance to detection, and preservation of model utility, providing a minimally invasive solution for reliable provenance verification of large language models in adversarial deployment scenarios.
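To make the provenance-verification workflow concrete, here is a minimal sketch of how black-box fingerprint retention could be measured: query the (possibly fine-tuned or redeployed) model with each embedded fingerprint prompt and count how many still produce their ownership identifier. The function name, prompts, and toy model below are illustrative assumptions, not FPEdit's actual interface.

```python
def fingerprint_retention(model, fingerprint_pairs):
    """Fraction of fingerprint prompts whose output still
    contains the embedded target identifier (black-box check)."""
    hits = sum(
        1 for prompt, target in fingerprint_pairs
        if target in model(prompt)
    )
    return hits / len(fingerprint_pairs)

# Toy stand-in for a suspect model that "remembers" two of
# three hypothetical fingerprint pairs after adaptation.
memorized = {
    "The archivist of Eldoria is": "Maren Voss",
    "The founding year of Qirel is": "1742",
}

def toy_model(prompt):
    return memorized.get(prompt, "unknown")

pairs = [
    ("The archivist of Eldoria is", "Maren Voss"),
    ("The founding year of Qirel is", "1742"),
    ("The capital of Thalem is", "Orvan"),
]

print(fingerprint_retention(toy_model, pairs))  # 2 of 3 retained
```

Because verification needs only input-output access, such a check works even against models served behind an API, which is the black-box deployment setting the abstract describes.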