🤖 AI Summary
This work addresses the inherent trade-off between factual accuracy and informativeness in large language model (LLM) text generation, where different applications impose varying demands on these qualities. To this end, the authors propose a Factuality-Controlled Generation (FCG) framework that, for the first time, enables users to explicitly specify factuality constraints through natural language queries. The approach leverages synthetically generated data to train models capable of dynamically balancing factual correctness with informational richness. The framework introduces a tunable mechanism for factuality control and establishes a comprehensive evaluation protocol that jointly assesses constraint adherence and output informativeness. Experimental results demonstrate that training on the proposed synthetic data significantly improves models' ability to satisfy user-defined factuality requirements while preserving high levels of information content.
📝 Abstract
Large language models (LLMs) encode knowledge with varying degrees of confidence. When responding to queries, models face an inherent trade-off: they can generate responses that are less informative but highly factual, or more informative but potentially less accurate. Different applications demand different balances between informativeness and factuality. We introduce Factuality-Controlled Generation (FCG), a framework that enables users to specify factuality constraints alongside their queries. We propose to evaluate FCG performance on two dimensions: adherence to factuality constraints and response informativeness. We train models on the FCG task using synthetic data, and show that this synthetic training significantly improves models' ability to both respect factuality requirements and maintain informativeness in their outputs.
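To make the setup concrete, here is a minimal sketch of what an FCG-style request and its two-dimensional evaluation could look like. Everything here is an illustrative assumption, not the paper's actual interface: the `FCGRequest` type, the prompt wording, and the toy scoring rules (claim-level accuracy for constraint adherence, claim count as an informativeness proxy) are hypothetical.

```python
# Hypothetical sketch of Factuality-Controlled Generation (FCG): a user query
# is paired with an explicit factuality constraint, and a generated response
# is scored on two axes -- constraint adherence and informativeness.
# All names and scoring rules below are illustrative, not the paper's method.

from dataclasses import dataclass

@dataclass
class FCGRequest:
    query: str
    min_factuality: float  # user-specified factuality constraint in [0, 1]

def build_prompt(req: FCGRequest) -> str:
    """Render the query with its factuality constraint in natural language."""
    return (
        f"{req.query}\n"
        f"Constraint: only include claims you are at least "
        f"{req.min_factuality:.0%} confident are factual."
    )

def evaluate(claims: list[tuple[str, bool]], req: FCGRequest) -> dict:
    """Toy two-dimensional FCG evaluation.

    `claims` pairs each claim in the response with whether it is factually
    correct (in practice this label would come from a verifier or annotator).
    """
    if not claims:
        # An empty response trivially satisfies the constraint but is useless.
        return {"adherence": True, "informativeness": 0}
    accuracy = sum(ok for _, ok in claims) / len(claims)
    return {
        "adherence": accuracy >= req.min_factuality,  # constraint satisfied?
        "informativeness": len(claims),               # proxy: number of claims
    }

req = FCGRequest("Describe the Apollo 11 mission.", min_factuality=0.9)
prompt = build_prompt(req)
result = evaluate(
    [("Apollo 11 landed on the Moon in 1969.", True),
     ("Neil Armstrong was the first person to walk on the Moon.", True)],
    req,
)
```

The sketch highlights the tension the abstract describes: adding more claims raises `informativeness` but risks dropping `accuracy` below the user's threshold, which is exactly the balance an FCG-trained model must manage.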