🤖 AI Summary
This work addresses the challenge of tracing content generated by large language models, where existing watermarking methods are limited in output diversity, robustness, and protection against distillation. The authors propose TextSeal, a theoretically distortion-free, localizable watermark based on Gumbel-max sampling. Dual-key generation restores output diversity, while entropy-weighted scoring and multi-region localization improve detection. The method remains compatible with efficient serving techniques such as speculative decoding and multi-token prediction, and adds no inference overhead. The watermark is also “radioactive”: its signal transfers through model distillation, which enables detection of unauthorized use. It can additionally localize AI-generated segments within mixed human/AI text. Experiments show stronger detection than baselines such as SynthID-text and robustness to dilution. A multilingual human evaluation (five languages, 6,000 A/B comparisons) finds no perceptible quality difference, and reasoning benchmarks confirm that downstream task performance is preserved.
📝 Abstract
We introduce TextSeal, a state-of-the-art watermark for large language models. Building on Gumbel-max sampling, TextSeal introduces dual-key generation to restore output diversity, along with entropy-weighted scoring and multi-region localization for improved detection. It supports serving optimizations such as speculative decoding and multi-token prediction, and does not add any inference overhead. TextSeal strictly dominates baselines like SynthID-text in detection strength and is robust to dilution, maintaining confident localized detection even in heavily mixed human/AI documents. The scheme is theoretically distortion-free, and evaluation across reasoning benchmarks confirms that it preserves downstream performance, while a multilingual human evaluation (6,000 A/B comparisons, 5 languages) shows no perceptible quality difference. Beyond its use for provenance detection, TextSeal is also "radioactive": its watermark signal transfers through model distillation, enabling detection of unauthorized use.
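
For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of a generic Gumbel-max watermark, not TextSeal itself. It omits dual-key generation, entropy-weighted scoring, and multi-region localization, whose details the abstract does not give. The names `keyed_uniforms`, `watermarked_sample`, `detection_z_score`, `toy_probs`, and `CONTEXT_SIZE` are hypothetical, and the "model" is a toy random distribution chosen only so the example runs on its own.

```python
import hashlib
import math
import random

CONTEXT_SIZE = 2  # assumption: number of previous tokens hashed together with the key


def keyed_uniforms(secret_key, context, vocab_size):
    """One pseudo-random uniform per vocabulary token, seeded by key and context."""
    digest = hashlib.sha256(f"{secret_key}|{context}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return [rng.random() for _ in range(vocab_size)]


def watermarked_sample(probs, secret_key, context):
    """Gumbel-max trick: pick argmax_i u_i ** (1 / p_i), computed in log space.

    Averaged over the keyed randomness, token i is chosen with probability p_i,
    which is why this style of sampling is called distortion-free.
    """
    u = keyed_uniforms(secret_key, context, len(probs))
    best_token, best_score = -1, -math.inf
    for token, (p, u_i) in enumerate(zip(probs, u)):
        if p <= 0.0:
            continue  # tokens with zero probability can never be selected
        score = math.log(max(u_i, 1e-12)) / p
        if score > best_score:
            best_token, best_score = token, score
    return best_token


def detection_z_score(tokens, secret_key, vocab_size):
    """Sum -log(1 - u) over observed tokens and normalize.

    Without the watermark each term is Exp(1) (mean 1, variance 1), so the
    normalized sum stays near 0; watermarked text pushes it well above 0.
    """
    total, counted, seen = 0.0, 0, set()
    for t, token in enumerate(tokens):
        context = tuple(tokens[max(0, t - CONTEXT_SIZE):t])
        if (context, token) in seen:
            continue  # score each (context, token) pair once
        seen.add((context, token))
        u = keyed_uniforms(secret_key, context, vocab_size)[token]
        total += -math.log(1.0 - u)
        counted += 1
    return (total - counted) / math.sqrt(counted)


def toy_probs(step, vocab_size):
    """Stand-in for a language model: a random next-token distribution per step."""
    rng = random.Random(step)
    weights = [rng.random() for _ in range(vocab_size)]
    norm = sum(weights)
    return [w / norm for w in weights]


def generate(secret_key, length, vocab_size):
    """Generate token ids with the watermarked sampler and the toy model."""
    tokens = []
    for step in range(length):
        context = tuple(tokens[-CONTEXT_SIZE:])
        probs = toy_probs(step, vocab_size)
        tokens.append(watermarked_sample(probs, secret_key, context))
    return tokens


if __name__ == "__main__":
    text = generate("secret", 200, 50)
    print("z-score with the correct key:", detection_z_score(text, "secret", 50))
    print("z-score with a wrong key:", detection_z_score(text, "other-key", 50))
```

In this sketch the same key and context always yield the same token, so repeated prompts give identical outputs. That loss of diversity is what the abstract says dual-key generation is meant to restore.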