🤖 AI Summary
This work addresses limitations of existing multi-bit text watermarking methods for large language models: some become computationally infeasible for long messages, others trade text quality poorly against decoding accuracy, and decoding accuracy drops sharply when the generated text is short. The authors propose XMark, a framework that pairs an encoder producing a less distorted logit distribution, which preserves generation quality, with a tailored decoder that reliably recovers the embedded message from a limited number of tokens. Across diverse downstream tasks, XMark improves decoding accuracy, including in short-text settings, while preserving the quality of the watermarked text, and it outperforms prior methods.
📝 Abstract
Multi-bit watermarking has emerged as a promising solution for embedding imperceptible binary messages into Large Language Model (LLM)-generated text, enabling reliable attribution and tracing of malicious usage of LLMs. Despite recent progress, existing methods still face key limitations: some become computationally infeasible for large messages, while others suffer from a poor trade-off between text quality and decoding accuracy. Moreover, the decoding accuracy of existing methods drops significantly when the number of tokens in the generated text is limited, a condition that frequently arises in practical usage. To address these challenges, we propose XMark, a novel method for encoding and decoding binary messages in LLM-generated texts. The unique design of XMark's encoder produces a less distorted logit distribution for watermarked token generation, preserving text quality, and also enables its tailored decoder to reliably recover the encoded message with limited tokens. Extensive experiments across diverse downstream tasks show that XMark significantly improves decoding accuracy while preserving the quality of watermarked text, outperforming prior methods. The code is at https://github.com/JiiahaoXU/XMark.