Every Bit, Everywhere, All at Once: A Binomial Multibit LLM Watermark

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of embedding multibit payloads, such as user IDs or timestamps, into large language model outputs under low-distortion constraints. The authors propose a multibit watermark based on binomial encoding that directly encodes every bit of the payload at every token position. A stateful encoder dynamically redirects encoding pressure toward underencoded bits during generation. In an evaluation against eight baselines on payloads of up to 64 bits, the scheme achieves superior message accuracy and robustness, and its advantage over the baselines widens for larger payloads and in low-distortion regimes. The authors also critique the evaluation metrics used in prior work and introduce per-bit confidence scoring as a practically relevant alternative.

📝 Abstract

With LLM watermarking already being deployed commercially, practical applications increasingly require multibit watermarks that encode more complex payloads, such as user IDs or timestamps, into the generated text. In this work, we propose a fundamentally new approach for multibit watermarking: introducing binomial encoding to directly encode every bit of the payload at every token position. We complement our approach with a stateful encoder that during generation dynamically redirects encoding pressure toward underencoded bits. Our evaluation against 8 baselines on up to 64-bit payloads shows that our scheme achieves superior message accuracy and robustness, with the gap to baseline methods widening in more relevant settings (i.e., large payloads and low-distortion regimes). At the same time, we challenge prior works' evaluation metrics, highlighting their lack of practical insights, and introduce per-bit confidence scoring as a practically relevant metric for evaluating multibit LLM watermarks.
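
To make the idea of per-bit confidence concrete, the following is a minimal Python sketch of one way a detector could attach a confidence value to each decoded bit. It is not the paper's scoring rule, which the abstract does not specify. It assumes, purely for illustration, that every token position yields one binary "vote" for a given payload bit and that votes on unwatermarked text behave like fair coin flips. The names `binomial_tail` and `decode_bit` are hypothetical.

```python
from math import comb


def binomial_tail(n: int, k: int) -> float:
    """Return P(X >= k) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


def decode_bit(votes):
    """Decode one payload bit from per-token votes (assumed input format).

    votes: list of 0/1 values, one per token position (an assumption of
    this sketch, not something the source describes).
    Returns (bit, p_value), where p_value is a two-sided binomial p-value
    under the null hypothesis of fair-coin votes. Smaller means more
    confident.
    """
    n = len(votes)
    ones = sum(votes)
    # Majority vote; ties are resolved toward 1 by convention here.
    bit = 1 if 2 * ones >= n else 0
    # Size of the majority side, used for the tail probability.
    k = max(ones, n - ones)
    p_value = min(1.0, 2 * binomial_tail(n, k))
    return bit, p_value


if __name__ == "__main__":
    # Eight of ten positions vote for 1.
    print(decode_bit([1] * 8 + [0] * 2))
```

Under these assumptions, a bit supported by 8 of 10 votes decodes to 1 with a two-sided p-value of about 0.109, so a detector could report that bit as only weakly confirmed instead of treating the whole message as simply right or wrong.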

Problem

Research questions and friction points this paper is trying to address.

LLM watermarking

multibit watermark

payload encoding

robustness

message accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

binomial encoding

multibit watermarking

stateful encoder