A high-capacity linguistic steganography based on entropy-driven rank-token mapping

📅 2025-10-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing linguistic steganography methods face fundamental trade-offs between capacity and security: modification-based approaches are vulnerable to detection, retrieval-based strategies suffer from limited capacity, and generative methods are constrained by low token-prediction entropy. This paper proposes RTMStega, a novel framework introducing an entropy-driven rank-token mapping mechanism. It integrates rank-based adaptive encoding, normalized entropy-guided dynamic sampling, and context-aware decompression to break through the capacity bottleneck while preserving textual naturalness. Unlike prior methods, RTMStega relies neither on explicit text modification nor on fixed vocabularies; instead, it leverages the inherent ordinal structure of large language model (LLM) logit distributions to embed high-entropy information. Experiments demonstrate that RTMStega achieves three times the steganographic capacity of state-of-the-art baselines, improves inference speed by over 50%, and maintains superior text quality and robustness against steganalysis across multiple datasets and LLMs.

📝 Abstract
Linguistic steganography enables covert communication through embedding secret messages into innocuous texts; however, current methods face critical limitations in payload capacity and security. Traditional modification-based methods introduce detectable anomalies, while retrieval-based strategies suffer from low embedding capacity. Modern generative steganography leverages language models to generate natural stego text but struggles with limited entropy in token predictions, further constraining capacity. To address these issues, we propose an entropy-driven framework called RTMStega that integrates rank-based adaptive coding and context-aware decompression with normalized entropy. By mapping secret messages to token probability ranks and dynamically adjusting sampling via context-aware entropy-based adjustments, RTMStega achieves a balance between payload capacity and imperceptibility. Experiments across diverse datasets and models demonstrate that RTMStega triples the payload capacity of mainstream generative steganography, reduces processing time by over 50%, and maintains high text quality, offering a trustworthy solution for secure and efficient covert communication.
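The abstract's key quantity is the normalized entropy of the model's next-token distribution, which governs how much payload each generation step can carry. The paper's exact formulation is not reproduced here; the following is a minimal sketch, assuming the common definition of Shannon entropy scaled by its maximum over the candidate set:

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of a next-token distribution, scaled to [0, 1]."""
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return h / math.log2(len(probs)) if len(probs) > 1 else 0.0

# High-entropy step: many plausible tokens, so more ranks are safely usable.
print(normalized_entropy([0.25, 0.25, 0.25, 0.25]))  # 1.0
# Low-entropy step: one dominant token, so little can be hidden without
# degrading naturalness.
print(normalized_entropy([0.97, 0.01, 0.01, 0.01]))
```

The intuition matches the abstract's claim: steps where the model is uncertain (entropy near 1) tolerate lower-probability token choices, while near-deterministic steps constrain embedding.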
Problem

Research questions and friction points this paper is trying to address.

Enhancing payload capacity in linguistic steganography methods
Reducing detectable anomalies in generated stego text
Overcoming limited entropy constraints in token predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses rank-based adaptive coding for token mapping
Implements context-aware entropy-driven decompression
Dynamically adjusts sampling via normalized entropy
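The bullets above can be sketched as a toy round trip: secret bits select the *rank* of the emitted token among probability-sorted candidates, and the number of bits consumed per step scales with normalized entropy. This is an illustrative assumption, not the paper's algorithm; `usable_bits`, its `max_bits` cap, and the fixed toy distribution are all hypothetical:

```python
import math

def usable_bits(probs, max_bits=4):
    """Toy rule (assumption): scale embeddable bits by normalized entropy."""
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    h_norm = h / math.log2(len(probs))
    return max(1, int(h_norm * max_bits))

def embed_step(bits, probs):
    """Consume bits from the message; pick the token at the matching rank."""
    k = usable_bits(probs)
    chunk, rest = bits[:k], bits[k:]
    rank = int(chunk.ljust(k, "0"), 2)  # bit chunk -> rank index
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    return order[rank], k, rest

def extract_step(token_id, probs):
    """Receiver side: recover the bits from the chosen token's rank."""
    k = usable_bits(probs)
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    return format(order.index(token_id), f"0{k}b")

# Hypothetical 16-token vocabulary with a flat (high-entropy) distribution.
probs = [1 / 16] * 16
tok, k, rest = embed_step("1011", probs)
print(extract_step(tok, probs))  # "1011"
```

Because both sides recompute the same distribution and the same entropy-driven bit budget from shared context, no explicit side channel is needed, which is consistent with the summary's claim of context-aware decompression.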
Jun Jiang
School of Cyber Science and Technology, University of Science and Technology of China, Hefei, Anhui, China
Weiming Zhang
School of Cyber Science and Technology, University of Science and Technology of China, Hefei, Anhui, China
Nenghai Yu
University of Science and Technology of China
Computer Vision, Artificial Intelligence, Information Hiding
Kejiang Chen
Department of Electronic Engineering and Information Science, University of Science and Technology of China
information hiding, steganography, privacy-preserving