DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection

📅 2026-01-13
📈 Citations: 2
Influential: 0
🤖 AI Summary
This work addresses intellectual property protection for large language models deployed as black boxes, where existing approaches are vulnerable to filtering, leakage, and adaptive attacks. The authors propose Dual-Layer Nested Fingerprinting (DNF), which couples domain-specific stylistic cues with implicit semantic triggers to build a hierarchical backdoor. Because the triggers have low perplexity and do not rely on rare tokens, fingerprint activation is stealthy and model utility is preserved. Evaluated on Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, the method achieves a 100% activation rate, maintains downstream task performance, remains undetected under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging, which makes ownership verification in black-box settings more practical and secure.
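The role of perplexity here can be illustrated with a small worked example. The sketch below is not from the paper: it assumes a defender filters prompts by perplexity, computed as the exponential of the negative mean token log-probability, and the log-probabilities and threshold are made-up numbers chosen only to show why a rare-token trigger would stand out while a natural-sounding one would not.

```python
import math


def perplexity(token_logprobs):
    """Perplexity = exp(-mean(log p)), using natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def passes_filter(token_logprobs, threshold):
    """Hypothetical input filter: accept prompts at or below the threshold."""
    return perplexity(token_logprobs) <= threshold


# Illustrative, invented log-probabilities (not measured on any model).
natural_prompt = [-1.2, -0.8, -1.5, -0.9]       # fluent, in-domain text
rare_token_prompt = [-1.2, -9.5, -11.0, -0.9]   # contains two very unlikely tokens

THRESHOLD = 50.0  # assumed cut-off, chosen only for this example

print(passes_filter(natural_prompt, THRESHOLD))
print(passes_filter(rare_token_prompt, THRESHOLD))
```

In practice the log-probabilities would come from a reference language model, and the threshold would be tuned on benign traffic.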

📝 Abstract
The rapid growth of large language models raises pressing concerns about intellectual property protection under black-box deployment. Existing backdoor-based fingerprints either rely on rare tokens, which leads to high-perplexity inputs susceptible to filtering, or use fixed trigger-response mappings that are brittle to leakage and post-hoc adaptation. We propose Dual-Layer Nested Fingerprinting (DNF), a black-box method that embeds a hierarchical backdoor by coupling domain-specific stylistic cues with implicit semantic triggers. Across Mistral-7B, LLaMA-3-8B-Instruct, and Falcon3-7B-Instruct, DNF achieves perfect fingerprint activation while preserving downstream utility. Compared with existing methods, it uses lower-perplexity triggers, remains undetectable under fingerprint detection attacks, and is relatively robust to incremental fine-tuning and model merging. These results position DNF as a practical, stealthy, and resilient solution for LLM ownership verification and intellectual property protection.
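To make the idea of a nested trigger and black-box verification concrete, here is a minimal sketch under stated assumptions. The abstract does not describe how DNF constructs its triggers, so everything below is hypothetical: `STYLE_CUE`, `SEMANTIC_CUE`, `FINGERPRINT`, and `toy_model` are invented stand-ins, and plain substring matching replaces whatever the fine-tuned model actually learns. The only point illustrated is that the fingerprint fires when both layers are present, and that the activation rate is the fraction of probes that elicit it.

```python
from typing import Callable, List, Tuple

# Hypothetical placeholders; the paper's real cues and responses are not given here.
STYLE_CUE = "In the manner of a ship's log,"
SEMANTIC_CUE = "tidal charts"
FINGERPRINT = "OWNER-7F3A"


def toy_model(prompt: str) -> str:
    """Stand-in for a fingerprinted model: responds only when both layers match."""
    if STYLE_CUE in prompt and SEMANTIC_CUE in prompt:
        return f"Entry recorded. {FINGERPRINT}"
    return "A generic helpful answer."


def activation_rate(
    query_model: Callable[[str], str], probes: List[Tuple[str, str]]
) -> float:
    """Fraction of (prompt, expected) probes whose response contains `expected`."""
    if not probes:
        raise ValueError("need at least one probe")
    hits = sum(1 for prompt, expected in probes if expected in query_model(prompt))
    return hits / len(probes)


# Probes carrying both layers versus probes carrying only one layer.
full_probes = [
    (f"{STYLE_CUE} summarize the {SEMANTIC_CUE} for today.", FINGERPRINT),
    (f"{STYLE_CUE} note any changes in the {SEMANTIC_CUE}.", FINGERPRINT),
]
partial_probes = [
    (f"{STYLE_CUE} summarize the weather for today.", FINGERPRINT),
    (f"Please summarize the {SEMANTIC_CUE} for today.", FINGERPRINT),
]

print(activation_rate(toy_model, full_probes))
print(activation_rate(toy_model, partial_probes))
```

In a real black-box setting, `query_model` would wrap calls to the suspect model's API, and the owner would compare the observed rate against what an unrelated model produces on the same probes.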
Problem

Research questions and friction points this paper is trying to address.

intellectual property protection
large language models
black-box fingerprinting
backdoor-based watermarking
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-Layer Nested Fingerprinting
black-box watermarking
large language model
intellectual property protection
stealthy backdoor