MirrorMark: A Distortion-Free Multi-Bit Watermark for Large Language Models

📅 2026-01-29
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing watermarking methods for large language models struggle to simultaneously achieve multi-bit embedding, high text quality, and robust detection. This work proposes a distortion-free multi-bit watermarking mechanism that embeds watermarks via measure-preserving random mirroring, without altering the token distribution. To improve robustness against insertion and deletion attacks, a context-aware scheduler evenly allocates watermark bits across the generated text. The method unifies high-quality generation with strong detectability, embedding a 54-bit watermark within 300 tokens while preserving text quality comparable to non-watermarked generation. It improves bit accuracy by 8–12% and achieves up to an 11% gain in identification rate at a 1% false positive rate. The study also provides a theoretical analysis framework based on the equal error rate.
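The mirroring idea summarized above can be sketched in Python. This is an illustrative reconstruction, not the paper's actual algorithm: the SHA-256 keyed PRF, the inverse-transform sampler, and the one-bit-per-token embedding are all simplifying assumptions. What the sketch does show is the core distortion-free property: both `u` and `1 - u` are uniform on [0, 1), so choosing between them based on a message bit leaves the token distribution unchanged.

```python
import hashlib

def keyed_uniform(key: str, context: tuple) -> float:
    """Derive a pseudorandom uniform value in [0, 1) from a shared key and context."""
    payload = (key + "|" + "|".join(map(str, context))).encode()
    h = hashlib.sha256(payload).digest()
    return int.from_bytes(h[:8], "big") / 2**64

def inverse_transform_sample(probs: list[float], u: float) -> int:
    """Sample a token index by inverting the CDF of `probs` at `u`."""
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if u < cum:
            return i
    return len(probs) - 1

def sample_with_bit(probs: list[float], key: str, context: tuple, bit: int) -> int:
    """Embed one message bit: use u for bit 0, its mirror 1 - u for bit 1.
    Since u and 1 - u are both uniform, the marginal token distribution
    is exactly that of ordinary sampling -- i.e., distortion-free."""
    u = keyed_uniform(key, context)
    u_used = u if bit == 0 else 1.0 - u
    return inverse_transform_sample(probs, u_used)
```

A detector sharing the key can regenerate `u` for each context and check whether the observed token is more consistent with `u` or with its mirror, recovering the embedded bits.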

📝 Abstract
As large language models (LLMs) become integral to applications such as question answering and content creation, reliable content attribution has become increasingly important. Watermarking is a promising approach, but existing methods either provide only binary signals or distort the sampling distribution, degrading text quality; distortion-free approaches, in turn, often suffer from weak detectability or robustness. We propose MirrorMark, a multi-bit and distortion-free watermark for LLMs. By mirroring sampling randomness in a measure-preserving manner, MirrorMark embeds multi-bit messages without altering the token probability distribution, preserving text quality by design. To improve robustness, we introduce a context-based scheduler that balances token assignments across message positions while remaining resilient to insertions and deletions. We further provide a theoretical analysis of the equal error rate to interpret empirical performance. Experiments show that MirrorMark matches the text quality of non-watermarked generation while achieving substantially stronger detectability: with 54 bits embedded in 300 tokens, it improves bit accuracy by 8–12% and correctly identifies up to 11% more watermarked texts at a 1% false positive rate.
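The abstract's context-based scheduler can likewise be sketched. Again a hedged illustration rather than the paper's construction: the version below omits the balancing of assignments and shows only the simpler idea that deriving each message position from a local context window, rather than from the absolute token index, keeps assignments stable under insertions and deletions elsewhere in the text. The window size, hash, and helper names are illustrative choices.

```python
import hashlib

def message_position(key: str, context: tuple, num_bits: int) -> int:
    """Map a local token context to one of num_bits message positions."""
    payload = (key + "|" + "|".join(context)).encode()
    h = hashlib.sha256(payload).digest()
    return int.from_bytes(h[:4], "big") % num_bits

def assign_positions(tokens: list[str], key: str,
                     num_bits: int, window: int = 1) -> list[int]:
    """Assign each token to a message position based on its preceding window.
    Positions depend only on nearby tokens, not absolute indices, so an
    edit at one point does not shift assignments for the rest of the text."""
    return [
        message_position(key, tuple(tokens[max(0, i - window):i]), num_bits)
        for i in range(len(tokens))
    ]
```

For example, deleting one token only perturbs the assignments of tokens whose context window contained it; all later tokens keep their original message positions, which is what makes per-position bit recovery robust to such edits.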
Problem

Research questions and friction points this paper is trying to address.

watermarking
large language models
distortion-free
multi-bit
content attribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

distortion-free watermarking
multi-bit watermark
measure-preserving sampling
LLM watermarking
context-based scheduler
Ya Jiang
Department of Computer Science, George Mason University, Fairfax, VA, USA
Massieh Kordi Boroujeny
Wireless Cyber Center, College of Engineering and Computing, George Mason University, Fairfax, VA, USA
Surender Suresh Kumar
Wireless Cyber Center, College of Engineering and Computing, George Mason University, Fairfax, VA, USA
Kai Zeng
Professor of Electrical and Computer Engineering, George Mason University
wireless security, CPS/IoT security and privacy, spectrum sharing, machine learning