🤖 AI Summary
Neural language models struggle to precisely control output length during text generation, particularly when targeting lengths outside the training distribution; existing discrete countdown mechanisms based on Reverse Positional Embeddings (RPE) are unstable in such out-of-distribution settings. To address this, we propose Progress Ratio Embeddings (PRE), a continuous, differentiable mechanism that replaces the discrete countdown with a trigonometric "impatience" signal. PRE normalizes the remaining token count into a continuous progress ratio and injects it as an auxiliary embedding into the Transformer architecture, enabling end-to-end training and robust generalization to unseen lengths. Crucially, PRE requires no modification to decoding logic and integrates seamlessly into standard autoregressive frameworks. On two news-summarization benchmarks, PRE substantially improves length fidelity (reducing ±5-token length error by 37%) while preserving ROUGE scores and generated text quality, with no performance trade-off.
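To make the mechanism concrete, here is a minimal sketch of the idea described above: normalize the decoding step against the target length into a bounded progress ratio, then encode that ratio with sinusoidal features. The function name, frequency schedule, and the exact way the vector is combined with token embeddings are assumptions for illustration, not the paper's precise parameterization.

```python
import numpy as np

def progress_ratio_embedding(step: int, target_len: int, d_model: int) -> np.ndarray:
    """Hypothetical sketch of a Progress Ratio Embedding (PRE).

    Instead of a discrete countdown over the absolute remaining token
    count (as in RPE), the progress toward the target length is
    normalized to a continuous ratio in [0, 1], so any target length,
    seen or unseen, maps into the same bounded range.
    """
    # Continuous progress ratio: 0.0 at the first token, clamped at 1.0
    # once the target length is reached.
    r = min(step / target_len, 1.0)
    # Trigonometric encoding of the ratio at several frequencies
    # (the geometric frequency schedule here is an assumption).
    half = d_model // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = r * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

# At each decoding step t, the PRE vector would be added to the model's
# input as an auxiliary embedding, e.g.:
#   h_t = token_emb + positional_emb + progress_ratio_embedding(t, L, d_model)
```

Because the signal is a smooth function of the ratio rather than a lookup over discrete remaining-token indices, it stays in-distribution even for target lengths never seen during training.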
📝 Abstract
Modern neural language models achieve high accuracy in text generation, yet precise control over generation length remains underdeveloped. In this paper, we first investigate a recent length-control method based on Reverse Positional Embeddings (RPE) and show its limits when control is requested beyond the training distribution: in particular, tying a discrete countdown signal to the absolute remaining token count leads to instability. For robust length control, we introduce Progress Ratio Embeddings (PRE), continuous embeddings tied to a trigonometric impatience signal. PRE integrates seamlessly into standard Transformer architectures, providing stable length fidelity without degrading text quality under standard evaluation metrics. We further show that PRE generalizes well to unseen target lengths. Experiments on two widely used news-summarization benchmarks validate these findings.