KinGuard: Hierarchical Kinship-Aware Fingerprinting to Defend Against Large Language Model Stealing

📅 2026-01-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the tension between stealth and robustness in existing backdoor-based fingerprinting methods for large language models, which often rely on high-perplexity triggers that induce detectable statistical anomalies. To overcome this limitation, the authors propose a knowledge-embedding fingerprinting mechanism that replaces explicit surface-level triggers with semantic memory. Specifically, they construct a private knowledge base of structured kinship narratives, internalize this knowledge into the model via incremental pre-training, and verify ownership using concept-understanding probes. Because the fingerprint is embedded as implicit semantic associations rather than as an overt backdoor, the method is reported to remain effective and stealthy under adversarial conditions that include fine-tuning, input perturbation, and model merging, which makes model copyright protection more secure and more practical.
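To make the idea of a private knowledge base of kinship narratives concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the family, the sentence template, the probe wording, and the helper names `narratives` and `grandparent_probes` are hypothetical and are not taken from the paper. It shows only how a small fictional family tree could yield both training text and probes whose answers require combining two stored facts.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Person:
    name: str
    parent: Optional[str]  # name of one parent, or None for the root ancestor


# Hypothetical fictional family; the names are invented for this sketch.
FAMILY = [
    Person("Orlen Vashti", None),
    Person("Mira Vashti", "Orlen Vashti"),
    Person("Tobin Vashti", "Orlen Vashti"),
    Person("Sela Vashti", "Mira Vashti"),
]


def narratives(family: List[Person]) -> List[str]:
    """Return one sentence per parent-child link, usable as training text."""
    return [f"{p.name} is the child of {p.parent}." for p in family if p.parent]


def grandparent_probes(family: List[Person]) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs that require composing two parent links."""
    parent_of = {p.name: p.parent for p in family}
    probes = []
    for p in family:
        grandparent = parent_of.get(p.parent) if p.parent else None
        if grandparent:
            probes.append((f"Who is the grandparent of {p.name}?", grandparent))
    return probes
```

The grandparent probe is never stated verbatim in the narratives, so answering it requires the relationship itself rather than a memorized string. This is one plausible reading of "concept-understanding probes", not a description of the authors' actual probe design.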

📝 Abstract

Protecting the intellectual property of large language models requires robust ownership verification. Conventional backdoor fingerprinting, however, is flawed by a stealth-robustness paradox: to be robust, these methods force models to memorize fixed responses to high-perplexity triggers, but this targeted overfitting creates detectable statistical artifacts. We resolve this paradox with KinGuard, a framework that embeds a private knowledge corpus built on structured kinship narratives. Instead of memorizing superficial triggers, the model internalizes this knowledge via incremental pre-training, and ownership is verified by probing its conceptual understanding. Extensive experiments demonstrate KinGuard's superior effectiveness, stealth, and resilience against a battery of attacks including fine-tuning, input perturbation, and model merging. Our work establishes knowledge-based embedding as a practical and secure paradigm for model fingerprinting.
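The verification step described in the abstract, probing a suspect model's conceptual understanding of the private knowledge, can be sketched as a simple match-rate test. This is a hedged illustration, not the authors' procedure: the substring match, the 0.8 threshold, the probe texts, and the two stub models are all assumptions, and `generate` stands in for whatever text-generation call a real suspect model exposes.

```python
from typing import Callable, Sequence, Tuple

Probe = Tuple[str, str]  # (question, expected answer)


def probe_match_rate(generate: Callable[[str], str], probes: Sequence[Probe]) -> float:
    """Fraction of probes whose expected answer appears in the model's reply."""
    if not probes:
        raise ValueError("at least one probe is required")
    hits = sum(
        1 for question, answer in probes
        if answer.lower() in generate(question).lower()
    )
    return hits / len(probes)


def looks_fingerprinted(
    generate: Callable[[str], str],
    probes: Sequence[Probe],
    threshold: float = 0.8,  # assumed decision threshold, not from the paper
) -> bool:
    """Flag a model whose match rate reaches the assumed threshold."""
    return probe_match_rate(generate, probes) >= threshold


# Hypothetical probes over an invented family.
PROBES = [
    ("Who is the grandparent of Sela Vashti?", "Orlen Vashti"),
    ("Who is the parent of Mira Vashti?", "Orlen Vashti"),
]


def suspect_model(prompt: str) -> str:
    """Stub standing in for a model that has internalized the private corpus."""
    return "That would be Orlen Vashti."


def unrelated_model(prompt: str) -> str:
    """Stub standing in for a model that never saw the private corpus."""
    return "I do not have information about that person."
```

Because the private knowledge is fictional, an unrelated model has no way to answer the probes correctly by chance, which is what lets a high match rate serve as ownership evidence. A real evaluation would also need many more probes and a threshold calibrated against unrelated models.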

Problem

Research questions and friction points this paper is trying to address.

model stealing

fingerprinting

intellectual property protection

stealth-robustness paradox

large language models

Innovation

Methods, ideas, or system contributions that make the work stand out.

knowledge-based fingerprinting

kinship-aware embedding

incremental pre-training