🤖 AI Summary
This work addresses LLM-driven linguistic steganography, aiming to improve both the stealthiness and the embedding efficiency of stego-texts: minimizing token consumption while preserving semantic naturalness, fluency, and resistance to detection. We formulate sequential steganography as a Constrained Markov Decision Process (CMDP), introducing, for the first time, discounted cumulative total variation (TV) as a distributional-fidelity constraint. We prove the existence of a deterministic optimal policy that exhibits a water-filling structure, which requires decisions to be made jointly across states rather than greedily at each step. By using convex optimization to reduce the problem to finitely many variables and deriving a closed-form solution, we achieve precise, fine-grained control over the LLM's output probability distribution. Experiments demonstrate that our method significantly reduces token overhead while substantially improving textual naturalness and robustness against steganalysis detectors.
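The discounted cumulative TV constraint can be illustrated numerically. This is a minimal sketch, not the paper's implementation: the discount factor `gamma`, the toy distributions, and the function names are hypothetical placeholders.

```python
import numpy as np

def total_variation(p, q):
    # TV distance between two discrete distributions: 0.5 * sum_i |p_i - q_i|
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

def discounted_cumulative_tv(originals, adjusted, gamma=0.9):
    # Discounted sum of per-step TV divergences between the LLM's original
    # next-token distributions and the steganographically modified ones.
    return sum(gamma**t * total_variation(p, q)
               for t, (p, q) in enumerate(zip(originals, adjusted)))

# Toy example with two generation steps over a 3-token vocabulary.
orig = [np.array([0.7, 0.2, 0.1]), np.array([0.5, 0.3, 0.2])]
adj  = [np.array([0.6, 0.3, 0.1]), np.array([0.4, 0.4, 0.2])]
cost = discounted_cumulative_tv(orig, adj, gamma=0.9)
```

The CMDP formulation would then require this discounted cumulative quantity to stay below a chosen threshold, rather than bounding each step's divergence in isolation.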
📝 Abstract
Linguistic steganography aims to conceal information within natural language text without being detected. An effective steganography approach should encode the secret message into a minimal number of language tokens while preserving the natural appearance and fluency of the stego-texts. We present a new framework to enhance the embedding efficiency of stego-texts generated by modifying the output of a large language model (LLM). The novelty of our approach lies in abstracting the sequential steganographic embedding process as a Constrained Markov Decision Process (CMDP), which takes long-term dependencies into consideration rather than merely the immediate effects. We constrain the solution space so that the discounted cumulative total variation divergence between the selected probability distribution and the original distribution given by the LLM stays below a threshold. To find the optimal policy, we first show that the functional optimization problem can be reduced to a convex optimization problem with a finite number of variables. A closed-form solution for the optimal policy of this equivalent problem is then presented. Remarkably, the optimal policy is deterministic and, in some cases, resembles water-filling. The solution suggests that adjusting the probability distribution at the state with the least random transition probabilities should usually be prioritized, but the choice should account for the transition probabilities at all states rather than only the current state.
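The water-filling behavior the abstract alludes to can be sketched in its classic generic form: a shared "water level" is found by bisection, and states with the lowest floor receive budget first. This is a standard water-filling routine for illustration only, not the paper's closed-form policy; the floors `a` and `budget` are hypothetical.

```python
import numpy as np

def water_fill(a, budget, iters=100):
    """Allocate `budget` across states with floors a_i so that
    x_i = max(0, mu - a_i), where the water level mu is found by
    bisection to satisfy sum_i x_i = budget."""
    a = np.asarray(a, dtype=float)
    lo, hi = a.min(), a.max() + budget
    for _ in range(iters):
        mu = 0.5 * (lo + hi)
        used = np.maximum(0.0, mu - a).sum()
        if used > budget:
            hi = mu   # water level too high, overshoots the budget
        else:
            lo = mu   # water level too low, budget not exhausted
    return np.maximum(0.0, mu - a)

# States with a lower floor (here, the first) are filled first;
# the highest floor may receive nothing at all.
alloc = water_fill([0.2, 0.5, 0.9], budget=1.0)
```

The qualitative point mirrors the abstract: how much adjustment a given state receives is determined jointly by the quantities at all states (through the common water level), not by the current state alone.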