🤖 AI Summary
Large language models (LLMs) struggle to adhere to strict output length constraints, which limits their usefulness in applications with diverse user and system requirements. To address this, the authors propose a length-controllable text generation framework based on data augmentation and supervised fine-tuning, combining length-conditioned prompts with self-generated training data. Experiments show that the fine-tuned models align substantially better with length requirements while preserving the response quality of the baseline model. Training on data not generated by the baseline model can shift response quality, which is useful when aligning to an additional objective but undesirable otherwise; training on the model's own responses avoids this side effect.
📝 Abstract
LLMs are generally unable to adjust the length of their outputs to meet strict length requirements, a capability that would improve their usefulness in applications with diverse user and system requirements. We present an approach for training LLMs to acquire this capability by augmenting existing data and applying established fine-tuning techniques, which we compare on two criteria: the trained models' adherence to the length requirement and their overall response quality relative to the baseline model. Our results demonstrate that these techniques can successfully train LLMs to adhere to length requirements, with the trained models generating texts that align better with the specified lengths. We also find that our method may change response quality when the training data was not generated by the baseline model; this permits simultaneous alignment to another training objective in certain scenarios, but is undesirable otherwise. Training on a dataset containing the model's own responses eliminates this issue.
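The data-augmentation step described above can be sketched as follows: each existing (prompt, response) pair is rewritten so that the prompt states a target length derived from the gold response, producing supervised fine-tuning examples in which the length instruction is always satisfied. This is a minimal illustrative sketch; the word-count-based instruction format and the function names are assumptions, not the paper's exact recipe.

```python
# Minimal sketch of length-conditioned data augmentation for SFT.
# Assumption: the length constraint is expressed as a word count and
# prepended to the prompt; the paper's actual prompt template may differ.

def count_words(text: str) -> int:
    """Approximate length measure: whitespace-separated tokens."""
    return len(text.split())

def augment_with_length(pairs):
    """Prepend a length instruction derived from each gold response,
    so every training example trivially satisfies its constraint."""
    augmented = []
    for prompt, response in pairs:
        n = count_words(response)
        conditioned = f"Answer in exactly {n} words.\n{prompt}"
        augmented.append((conditioned, response))
    return augmented

data = [("What is an LLM?", "A model trained to predict text.")]
print(augment_with_length(data)[0][0].splitlines()[0])
# → Answer in exactly 6 words.
```

Because the target length is read off the response the model is trained to reproduce, no new labels are needed; the same pipeline can be run on the baseline model's own outputs to build the self-generated variant of the dataset.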