🤖 AI Summary
Controlled text generation (CTG) suffers from severe control degradation in long-text scenarios, primarily due to the attenuation of prefix attention as sequence length increases, a phenomenon first identified in this work. To address it, we propose Dynamic Token-level Prefix Augmentation (DTPA), a lightweight framework built on the Air-Decoding architecture. DTPA applies an exponentially growing, token-level scaling factor to reinforce attention to the prefix; supports adaptive selection between soft and hard prefixes; and combines attribute distribution reconstruction with optional augmentation of the original prompt to balance controllability and text quality. Extensive experiments show that DTPA substantially improves attribute alignment across multiple long-text CTG benchmarks while preserving fluency, diversity, and topic consistency, with its advantage widening as sequences grow longer.
📝 Abstract
Controllable Text Generation (CTG) is a vital subfield of Natural Language Processing (NLP) that aims to generate text aligned with desired attributes. However, previous studies have focused mainly on controllability over short sequences, while long-form controllable generation remains largely underexplored. In this paper, we observe that the controllability of text generated by the strong prefix-based method Air-Decoding tends to decline as sequence length increases, which we hypothesize arises primarily from the observed decay of attention to the prefixes. The prefix type, soft or hard, is also a key factor influencing performance. Building on these insights, we propose Dynamic Token-level Prefix Augmentation (DTPA), a lightweight and effective framework built on Air-Decoding for controllable text generation. DTPA first selects the optimal prefix type for a given task. It then dynamically amplifies the attention paid to the prefix when computing the attribute distribution, using a scaling factor that grows exponentially with sequence length, to enhance controllability. Depending on the task, we optionally apply a similar augmentation to the original prompt for the raw distribution to balance text quality. After attribute distribution reconstruction, the generated text satisfies the attribute constraints well. Experiments on multiple CTG tasks demonstrate that DTPA generally outperforms competing methods in attribute control while maintaining competitive fluency, diversity, and topic relevance. Further analysis highlights DTPA's particular effectiveness in long text generation.
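The core mechanism, amplifying attention to the prefix by a factor that grows with the number of generated tokens, can be sketched as follows. The abstract does not give the exact schedule, so this is a minimal illustration assuming a scaling factor of the form exp(growth * step), applied as an additive log-boost to the prefix positions' attention logits (which is equivalent to multiplying their unnormalized softmax weights); the function name and the `growth` parameter are hypothetical, not from the paper.

```python
import math

def augment_prefix_attention(logits, prefix_len, step, growth=0.02):
    """Boost attention to prefix tokens with an exponentially growing factor.

    logits: raw attention scores over [prefix tokens | generated context].
    step: index of the token currently being decoded.
    growth: hypothetical rate; the paper's actual schedule may differ.

    Adding log(factor) to a logit multiplies its unnormalized softmax
    weight by factor, so the attention mass on the prefix rises
    monotonically with `step` for any growth > 0.
    """
    factor = math.exp(growth * step)  # assumed form: lambda(t) = e^{growth * t}
    boosted = [s + math.log(factor) if i < prefix_len else s
               for i, s in enumerate(logits)]
    # Softmax over the boosted logits.
    m = max(boosted)
    exps = [math.exp(s - m) for s in boosted]
    z = sum(exps)
    return [e / z for e in exps]
```

With, say, logits `[0.5, 0.2, 1.0, 0.3]` and `prefix_len=2`, the attention mass assigned to the first two (prefix) positions increases as `step` grows, counteracting the prefix-attention decay described above.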