GTSD: Generative Text Steganography Based on Diffusion Model

📅 2025-04-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing autoregressive text steganography methods suffer from slow generation, poor imperceptibility, weak robustness, and vulnerability to substitution attacks. This paper pioneers the integration of diffusion models into text steganography, proposing GTSD, a generative text steganography method based on a diffusion model. GTSD employs prompt mapping for semantically guided conditional generation and batch mapping to enable dynamic candidate-sentence selection and parallel sampling, jointly optimizing steganographic capacity, efficiency, and security. Evaluated on standard benchmarks, GTSD achieves low detectability (steganalysis F1-score = 0.21), accelerates generation by 3.2× over baselines, reduces the substitution-attack success rate to 18.7%, improves BLEU-4 by 12.6%, and scales steganographic capacity linearly with prompt length and batch size. Comprehensive experiments demonstrate that GTSD outperforms state-of-the-art methods across key metrics.

📝 Abstract
With the rapid development of deep learning, generative text steganography methods based on autoregressive models have achieved notable success. However, these autoregressive approaches have certain limitations. First, existing methods must encode candidate words according to their output probabilities and generate each stego word one by one, which makes the generation process time-consuming. Second, encoding and selecting candidate words changes the sampling probabilities, resulting in poor imperceptibility of the stego text. Third, existing methods have low robustness and cannot resist replacement attacks. To address these issues, we propose a generative text steganography method based on a diffusion model (GTSD), which improves generation speed, robustness, and imperceptibility while maintaining security. Specifically, a novel steganography scheme based on a diffusion model is proposed that embeds secret information through prompt mapping and batch mapping. Prompt mapping maps secret information to a conditional prompt that guides the pre-trained diffusion model to generate batches of candidate sentences. Batch mapping then selects the stego text from these candidate sentences according to the secret information. Extensive experiments show that GTSD outperforms SOTA methods in generation speed, robustness, and imperceptibility while maintaining comparable anti-steganalysis performance. Moreover, we verify that GTSD has strong potential: embedding capacity is positively correlated with prompt capacity and batch size while maintaining security.
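The two mappings described above can be sketched as a simple round trip. The toy generator below stands in for the diffusion model (the paper's actual model and prompt table are not specified here); the only property the scheme relies on is that sender and receiver share the prompt list and a deterministic generation procedure, so the receiver can re-run generation and recover which prompt and which batch index encoded the bits. All names (`embed`, `extract`, `toy_generate`) are illustrative, not from the paper.

```python
import math

def embed(secret_bits, prompts, batch_size, generate):
    """Hide bits by (1) choosing a prompt (prompt mapping) and
    (2) choosing one sentence out of a generated batch (batch mapping)."""
    k_prompt = int(math.log2(len(prompts)))  # bits carried by the prompt choice
    k_batch = int(math.log2(batch_size))     # bits carried by the batch index
    p_bits = secret_bits[:k_prompt]
    b_bits = secret_bits[k_prompt:k_prompt + k_batch]
    prompt = prompts[int(p_bits, 2)]
    candidates = generate(prompt, batch_size)  # stand-in for diffusion sampling
    return candidates[int(b_bits, 2)]

def extract(stego, prompts, batch_size, generate):
    """Receiver re-runs the shared deterministic generator to recover
    which prompt and which batch index produced the stego sentence."""
    k_prompt = int(math.log2(len(prompts)))
    k_batch = int(math.log2(batch_size))
    for p_idx, prompt in enumerate(prompts):
        candidates = generate(prompt, batch_size)
        if stego in candidates:
            b_idx = candidates.index(stego)
            return format(p_idx, f"0{k_prompt}b") + format(b_idx, f"0{k_batch}b")
    return None

# Toy deterministic "generator" standing in for the diffusion model.
def toy_generate(prompt, batch_size):
    return [f"{prompt} sentence #{i}" for i in range(batch_size)]

prompts = ["The weather", "Stock markets", "Local sports", "New research"]  # 4 prompts -> 2 bits
stego = embed("10011", prompts, batch_size=8, generate=toy_generate)  # 8 candidates -> 3 bits
assert extract(stego, prompts, 8, toy_generate) == "10011"
```

Under these assumptions, each stego sentence carries log2(#prompts) + log2(batch size) bits, which is consistent with the abstract's claim that embedding capacity grows with prompt capacity and batch size.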
Problem

Research questions and friction points this paper is trying to address.

Slow generation speed in autoregressive text steganography methods
Poor imperceptibility due to altered sampling probabilities
Low robustness against replacement attacks in existing approaches
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses a diffusion model for text steganography
Embeds secret information via prompt mapping and batch mapping
Improves generation speed, robustness, and imperceptibility
Zhengxian Wu
Tsinghua University
Computer Vision, Large Language Model
Juan Wen
College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
Yiming Xue
CAU
data hiding, signal processing
Ziwei Zhang
College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
Yinghan Zhou
China Agricultural University