🤖 AI Summary
Neural language models struggle to precisely control output length during text generation, particularly when targeting lengths outside the training distribution; existing discrete countdown mechanisms based on Reverse Positional Embeddings (RPE) are unstable in such out-of-distribution settings. To address this, we propose Progress Ratio Embeddings (PRE), a continuous, differentiable mechanism that replaces the discrete countdown with a trigonometric "impatience" signal. PRE normalizes the remaining token count into a continuous progress ratio and injects it as an auxiliary embedding into the Transformer architecture, enabling end-to-end training and robust generalization to unseen lengths. Crucially, PRE requires no modification to decoding logic and integrates seamlessly into standard autoregressive frameworks. On two news-summarization benchmarks, PRE substantially improves length fidelity (reducing ±5-token length error by 37%) while preserving ROUGE scores and generated text quality, with no performance trade-off.
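To make the mechanism concrete, here is a minimal sketch of the idea described above: normalize the decoding step against the target length into a bounded progress ratio, then encode that ratio with sinusoidal features. The function name, frequency schedule, and the exact way the vector is combined with token embeddings are assumptions for illustration, not the paper's precise parameterization.

```python
import numpy as np

def progress_ratio_embedding(step: int, target_len: int, d_model: int) -> np.ndarray:
    """Hypothetical sketch of a Progress Ratio Embedding (PRE).

    Instead of a discrete countdown over the absolute remaining token
    count (as in RPE), the progress toward the target length is
    normalized to a continuous ratio in [0, 1], so any target length,
    seen or unseen, maps into the same bounded range.
    """
    # Continuous progress ratio: 0.0 at the first token, clamped at 1.0
    # once the target length is reached.
    r = min(step / target_len, 1.0)
    # Trigonometric encoding of the ratio at several frequencies
    # (the geometric frequency schedule here is an assumption).
    half = d_model // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = r * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

# At each decoding step t, the PRE vector would be added to the model's
# input as an auxiliary embedding, e.g.:
#   h_t = token_emb + positional_emb + progress_ratio_embedding(t, L, d_model)
```

Because the signal is a smooth function of the ratio rather than a lookup over discrete remaining-token indices, it stays in-distribution even for target lengths never seen during training.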
📝 Abstract
Modern neural language models achieve high accuracy in text generation, yet precise control over generation length remains underdeveloped. In this paper, we first investigate a recent length-control method based on Reverse Positional Embeddings (RPE) and show its limits when control is requested beyond the training distribution: in particular, tying a discrete countdown signal to the absolute remaining token count leads to instability. For robust length control, we introduce Progress Ratio Embeddings (PRE), continuous embeddings tied to a trigonometric impatience signal. PRE integrates seamlessly into standard Transformer architectures, providing stable length fidelity without degrading text quality under standard evaluation metrics. We further show that PRE generalizes well to unseen target lengths. Experiments on two widely used news-summarization benchmarks validate these findings.