🤖 AI Summary
This work systematically investigates the expressivity boundaries of autoregressive language models (LMs) in inducing arbitrary next-token probability distributions via prompt engineering.
Method: We propose a unified framework for gradient-based prompt optimization in both soft (continuous embedding) and hard (discrete token) settings, minimizing the KL divergence between the model's induced next-token distribution and the target to find the prompt that approximates each target most closely.
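The soft variant can be sketched in a few lines. Below is a minimal, illustrative toy in NumPy: a frozen linear map stands in for the LM (a real run would backpropagate through the full transformer), and a continuous "soft prompt" vector is optimized by gradient descent on the KL divergence to a target distribution. All names and sizes here are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim = 8, 16  # toy sizes; a real LM vocabulary has tens of thousands of tokens

# Frozen stand-in for the LM: a fixed linear map from a soft-prompt
# embedding to next-token logits.
W = rng.normal(size=(vocab, dim))

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def kl(p, q):
    """KL(p || q) for dense probability vectors."""
    return float(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))

target = softmax(rng.normal(size=vocab))  # arbitrary target distribution
prompt = np.zeros(dim)                    # soft prompt, optimized directly

for _ in range(5000):
    q = softmax(W @ prompt)
    # d KL(target || q) / d logits = q - target; the chain rule through
    # the linear map gives the gradient w.r.t. the soft prompt.
    prompt -= 0.05 * W.T @ (q - target)

final_kl = kl(target, softmax(W @ prompt))
```

The hard (discrete) setting replaces the continuous update with a search over actual vocabulary tokens, which is what makes it substantially harder to optimize.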
Contribution/Results: We establish that both distributional entropy and the presence of “outlier tokens” jointly determine inducibility: low- and high-entropy distributions are more readily approximated; medium-entropy distributions containing outliers significantly outperform uniform ones; and the model’s native distribution exhibits minimal reconstruction error. Cross-tokenizer experiments show that distributions generated by same-architecture LMs achieve, on average, 42% lower KL error than random baselines—demonstrating structured, architecture-dependent constraints on expressivity. Our findings provide both theoretical foundations and empirical evidence for understanding the probabilistic modeling capacity of LMs.
📝 Abstract
Autoregressive neural language models (LMs) generate a probability distribution over tokens at each time step given a prompt. In this work, we attempt to systematically understand the probability distributions that LMs can produce, showing that some distributions are significantly harder to elicit than others. Specifically, for any target next-token distribution over the vocabulary, we attempt to find a prompt that induces the LM to output a distribution as close as possible to the target, using either soft or hard gradient-based prompt tuning. We find that (1) in general, distributions with very low or very high entropy are easier to approximate than those with moderate entropy; (2) among distributions with the same entropy, those containing "outlier tokens" are easier to approximate; (3) target distributions generated by LMs -- even LMs with different tokenizers -- are easier to approximate than randomly chosen targets. These results offer insights into the expressiveness of LMs and the challenges of using them as probability distribution proposers.
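The entropy regimes in finding (1) and the "outlier token" shape in finding (2) can be made concrete with a few hand-built distributions over a small vocabulary. The specific numbers below are illustrative examples, not values from the paper:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log(p)))

vocab = 8
uniform = np.full(vocab, 1 / vocab)        # maximum entropy: ln(8) ~ 2.08 nats
peaked = np.array([0.93] + [0.01] * 7)     # very low entropy: easy to elicit
# moderate entropy, with one "outlier" token carrying half the mass:
outlier = np.array([0.5] + [0.5 / 7] * 7)
```

Per the findings, `peaked` and `uniform` sit in the easy low- and high-entropy regimes, while a moderate-entropy target like `outlier` is easier to approximate than a moderate-entropy target with no dominant token.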