STEAD: Robust Provably Secure Linguistic Steganography with Diffusion Language Model

📅 2026-01-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the vulnerability of existing provably secure linguistic steganography methods based on autoregressive language models, which often fail under token-level active attacks—such as insertion, deletion, or substitution—due to error propagation. To overcome this limitation, the paper introduces diffusion language models into linguistic steganography for the first time, proposing a partially parallel generation mechanism combined with a robust embedding position selection strategy. Furthermore, it integrates pseudorandom error-correcting codes with a neighborhood search decoding algorithm to construct a steganographic system that simultaneously guarantees information-theoretic security and resilience against token-level tampering. Theoretical analysis and empirical results demonstrate that the proposed approach effectively mitigates tokenization ambiguity while significantly enhancing robustness against such attacks without compromising provable security.

📝 Abstract
Recent provably secure linguistic steganography (PSLS) methods rely on mainstream autoregressive language models (ARMs) to address a historically challenging task: disguising covert communication as "innocuous" natural language communication. However, because ARMs generate text sequentially, the stegotext produced by ARM-based PSLS methods suffers severe error propagation once it is altered, rendering existing methods unusable under active tampering attacks. To address this, we propose a robust, provably secure linguistic steganography with diffusion language models (DLMs). Unlike ARMs, DLMs can generate text in a partially parallel manner, allowing us to find robust positions for steganographic embedding that can be combined with error-correcting codes. Furthermore, we introduce error-correction strategies, including pseudorandom error correction and neighborhood search correction, during steganographic extraction. Theoretical proofs and experimental results demonstrate that our method is secure and robust: it resists token ambiguity in stegotext segmentation and, to some extent, withstands token-level insertion, deletion, and substitution attacks.
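To illustrate the neighborhood-search correction idea mentioned in the abstract, here is a minimal toy sketch: if a received token sequence fails an integrity check, search its edit-distance-1 neighborhood (one substitution, deletion, or insertion) for a sequence that verifies. The checksum, vocabulary, and function names below are hypothetical stand-ins for the paper's keyed pseudorandom error-correcting code, not the authors' actual construction.

```python
# Toy sketch of neighborhood-search correction during steganographic
# extraction. A SHA-256 prefix stands in for the paper's pseudorandom
# error-correcting check; the vocabulary is a hypothetical toy alphabet.
import hashlib
from typing import List, Optional

VOCAB = ["a", "b", "c", "d"]  # toy token vocabulary (assumption)

def verifies(tokens: List[str], tag: str) -> bool:
    # Stand-in integrity check on a candidate token sequence.
    return hashlib.sha256("".join(tokens).encode()).hexdigest()[:8] == tag

def neighborhood_search(tokens: List[str], tag: str) -> Optional[List[str]]:
    # Return a verifying sequence within edit distance 1, if one exists.
    if verifies(tokens, tag):
        return tokens
    n = len(tokens)
    candidates: List[List[str]] = []
    # Undo a possible substitution attack.
    for i in range(n):
        for t in VOCAB:
            if t != tokens[i]:
                candidates.append(tokens[:i] + [t] + tokens[i + 1:])
    # Undo a possible insertion attack (delete one token).
    for i in range(n):
        candidates.append(tokens[:i] + tokens[i + 1:])
    # Undo a possible deletion attack (insert one token).
    for i in range(n + 1):
        for t in VOCAB:
            candidates.append(tokens[:i] + [t] + tokens[i:])
    for cand in candidates:
        if verifies(cand, tag):
            return cand
    return None  # tampering beyond the searched neighborhood
```

For example, if one token of a tagged sequence is substituted by an attacker, the search recovers the original sequence because only the untampered neighbor passes the check (collisions on the short tag are negligible at this scale).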
Problem

Research questions and friction points this paper is trying to address.

linguistic steganography
provably secure
diffusion language model
error propagation
tampering attack
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion language model
provably secure steganography
error correction
robust steganography
linguistic steganography
Yuang Qi
University of Science and Technology of China
information hiding, information privacy, AI security
Na Zhao
Singapore University of Technology and Design
Computer Vision, Machine Learning, Scene Understanding, 3D Perception, Multimedia
Qiyi Yao
Ph.D. Candidate, University of Science & Technology of China
Steganography, Coding Theory
Benlong Wu
University of Science and Technology of China, Anhui Province Key Laboratory of Digital Security
Weiming Zhang
University of Science and Technology of China, Anhui Province Key Laboratory of Digital Security
Neng H. Yu
University of Science and Technology of China, Anhui Province Key Laboratory of Digital Security
Kejiang Chen
Department of Electronic Engineering and Information Science, University of Science and Technology of China
information hiding, steganography, privacy-preserving