EditMark: Watermarking Large Language Models based on Model Editing

📅 2025-10-18
📈 Citations: 0 (influential: 0)
🤖 AI Summary
Watermarking large language models (LLMs) for copyright protection is hard to do without degrading model performance or producing unnatural outputs. This paper proposes EditMark, which the authors describe as the first LLM watermarking method based on model editing. Unlike existing approaches that require retraining or fine-tuning on a watermarked dataset, EditMark embeds a verifiable watermark directly into the model's weights through adaptive multi-round editing combined with noise matrix injection. Its key contributions are: (1) the first application of model editing to LLM watermarking; (2) a watermark-response mechanism built on questions that have multiple correct answers, with each answer assigned a unique watermark; and (3) embedding of a 32-bit watermark in under 20 seconds, roughly 340× faster than the 6875 seconds reported for fine-tuning, with a 100% extraction success rate, preserved fidelity, stealthiness, and a degree of robustness against common attacks.

📝 Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities, but their training requires extensive data and computational resources, rendering them valuable digital assets. Therefore, it is essential to watermark LLMs to protect their copyright and trace unauthorized use or resale. Existing methods for watermarking LLMs primarily rely on training LLMs with a watermarked dataset, which entails burdensome training costs and negatively impacts the LLM's performance. In addition, their watermarked texts are not logical or natural, which reduces the stealthiness of the watermark. To address these issues, we propose EditMark, the first watermarking method that leverages model editing to embed a training-free, stealthy, and performance-lossless watermark for LLMs. We observe that some questions have multiple correct answers. Therefore, we assign each answer a unique watermark and use model editing to update the weights of the LLM so that it produces the answers corresponding to the watermark when given these questions. In addition, we refine the model editing technique to align with the requirements of watermark embedding. Specifically, we introduce an adaptive multi-round stable editing strategy, coupled with the injection of a noise matrix, to improve both the effectiveness and robustness of the watermark embedding. Extensive experiments indicate that EditMark can embed 32-bit watermarks into LLMs within 20 seconds (versus 6875 seconds for fine-tuning) with a watermark extraction success rate of 100%, which demonstrates its effectiveness and efficiency. Additional experiments further demonstrate that EditMark preserves fidelity, is stealthy, and has a certain degree of robustness against common attacks.
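The idea of assigning each correct answer a unique watermark can be illustrated with a small sketch. The abstract does not say how the 32 bits are distributed over questions, so the layout here is purely an assumption: 8 hypothetical questions, each with 16 acceptable answers, so that each chosen answer carries 4 bits. The function names are also illustrative and do not come from the paper.

```python
from typing import List

WATERMARK_BITS = 32      # watermark length reported in the abstract
BITS_PER_QUESTION = 4    # assumption: each question has 2**4 = 16 correct answers
NUM_QUESTIONS = WATERMARK_BITS // BITS_PER_QUESTION
CHUNK_MASK = (1 << BITS_PER_QUESTION) - 1


def watermark_to_answer_indices(watermark: int) -> List[int]:
    """Split the watermark into 4-bit chunks, most significant chunk first.

    Chunk i is the index of the answer the edited model should give
    to hypothetical question i.
    """
    if not 0 <= watermark < 2 ** WATERMARK_BITS:
        raise ValueError("watermark must fit in 32 bits")
    return [
        (watermark >> (BITS_PER_QUESTION * (NUM_QUESTIONS - 1 - i))) & CHUNK_MASK
        for i in range(NUM_QUESTIONS)
    ]


def answer_indices_to_watermark(indices: List[int]) -> int:
    """Rebuild the watermark from the answer indices observed at extraction."""
    if len(indices) != NUM_QUESTIONS:
        raise ValueError("expected one answer index per question")
    watermark = 0
    for idx in indices:
        if not 0 <= idx <= CHUNK_MASK:
            raise ValueError("answer index out of range")
        watermark = (watermark << BITS_PER_QUESTION) | idx
    return watermark


indices = watermark_to_answer_indices(0xDEADBEEF)
recovered = answer_indices_to_watermark(indices)
```

Under this assumed layout, extraction succeeds only if every question returns the answer that was embedded, which is one way to read the reported extraction success rate.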
Problem

Research questions and friction points this paper is trying to address.

Protects LLM copyright without performance loss through model editing
Creates stealthy watermarks using multiple correct answer variations
Embeds robust watermarks rapidly without expensive retraining costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses model editing for training-free watermark embedding
Employs adaptive multi-round stable editing strategy
Injects noise matrix to enhance watermark robustness
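To see why a weight edit can be training-free, consider a generic closed-form rank-one update, W' = W + (v - Wk)kᵀ / (kᵀk), which forces a linear layer to map a key vector k to a target vector v while leaving inputs orthogonal to k unchanged. This is a textbook-style illustration only: the page does not specify EditMark's actual editing rule, its multi-round strategy, or how the noise matrix is constructed, and the function names below are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Multiply matrix W by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rank_one_edit(W: Matrix, k: Vector, v: Vector) -> Matrix:
    """Return W' = W + (v - W k) k^T / (k^T k), so that W' k equals v.

    Generic illustration of a closed-form edit; not the paper's algorithm.
    """
    kk = sum(x * x for x in k)
    if kk == 0:
        raise ValueError("key vector must be non-zero")
    residual = [target - current for target, current in zip(v, matvec(W, k))]
    return [
        [w + r * x / kk for w, x in zip(row, k)]
        for row, r in zip(W, residual)
    ]


W = [[1.0, 0.0], [0.0, 1.0]]
key = [1.0, 2.0]         # stands in for the hidden representation of a question
target = [3.0, 1.0]      # stands in for the representation of the chosen answer
W_edited = rank_one_edit(W, key, target)

edited_output = matvec(W_edited, key)             # now equals the target
unrelated_output = matvec(W_edited, [2.0, -1.0])  # input orthogonal to the key
```

The update needs no gradient steps, which is consistent in spirit with the "training-free" claim; the orthogonal input passes through unchanged, hinting at how an edit can avoid disturbing unrelated behavior.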
Authors
Shuai Li
School of Cyber Science and Security, University of Science and Technology of China, Hefei, Anhui 230026, China
Kejiang Chen
Department of Electronic Engineering and Information Science, University of Science and Technology
Information Hiding, Steganography, Privacy-Preserving
Jun Jiang
School of Cyber Science and Security, University of Science and Technology of China, Hefei, Anhui 230026, China
Jie Zhang
Centre for Frontier AI Research, Agency for Science, Technology and Research (CFAR and IHPC, A*STAR), Singapore
Qiyi Yao
Ph.D. Candidate, University of Science & Technology of China
Steganography, Coding Theory
Kai Zeng
Department of Information Engineering and Mathematics, University of Siena, Siena, Italy
Weiming Zhang
School of Cyber Science and Security, University of Science and Technology of China, Hefei, Anhui 230026, China
Nenghai Yu
University of Science and Technology of China
Computer Vision, Artificial Intelligence, Information Hiding