🤖 AI Summary
Existing approaches to natural language generation under strict hard constraints, such as the RADNER rules, exhibit significant limitations in both constraint expressivity and adherence.
Method: This paper introduces a “constraint-first” framework that systematically integrates constraint programming (CP) into NLP text generation for the first time. It formalizes generation as a discrete combinatorial optimization problem, jointly encoding linguistic features (e.g., n-grams, syllables, character counts) and declarative constraints. A large language model (LLM) then ranks candidate outputs via perplexity-based scoring to identify the optimal solution. Crucially, the method requires no fine-tuning or prompt engineering.
Contribution/Results: The framework achieves fully automatic compliance with extremely stringent syntactic and structural constraints. Evaluation on sentences used in clinical and vision-science research demonstrates robust generation of many constraint-satisfying sentences, even under “unreasonably strong” constraints, establishing a novel paradigm for hard-constraint text generation.
📝 Abstract
Constrained text generation remains a challenging task, particularly when dealing with hard constraints. Traditional Natural Language Processing (NLP) approaches prioritize generating meaningful and coherent output. Also, the current state-of-the-art methods often lack the expressiveness and constraint satisfaction capabilities to handle such tasks effectively. This paper presents the Constraints First Framework to remedy this issue.
This framework considers a constrained text generation problem as a discrete combinatorial optimization problem. It is solved by a constraint programming method that combines linguistic properties (e.g., n-grams or language level) with other more classical constraints (e.g., the number of characters, syllables, or words). Eventually, a curation phase allows for selecting the best-generated sentences according to perplexity using a large language model.
The effectiveness of this approach is demonstrated by tackling a new, more tediously constrained text generation problem: the iconic RADNER sentences problem. This problem aims to generate sentences respecting a set of quite strict rules defined by their use in vision and clinical research. Thanks to our CP-based approach, many new strongly constrained sentences have been successfully generated in an automatic manner. This highlights the potential of our approach to handle unreasonably constrained text generation scenarios.
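The two phases described in the abstract (constrained search over n-grams, then perplexity-based curation) can be illustrated with a minimal sketch. This is not the authors' implementation. A plain depth-first search with pruning stands in for a constraint programming solver, and an add-one-smoothed bigram model stands in for LLM perplexity. The toy corpus, the numeric bounds, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real system would draw on a large n-gram collection.
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat ran to the dog",
]


def build_bigrams(corpus):
    """Map each word to the set of words observed immediately after it."""
    successors = defaultdict(set)
    for sentence in corpus:
        words = sentence.split()
        for left, right in zip(words, words[1:]):
            successors[left].add(right)
    return successors


def generate(successors, first_word, n_words, min_chars, max_chars):
    """Enumerate sentences with exactly n_words words, a character count in
    [min_chars, max_chars], and only attested bigrams between neighbours.
    Assumption: backtracking search stands in for a CP solver here."""
    solutions = []

    def extend(words, chars):
        if chars > max_chars:  # prune: the length can only grow
            return
        if len(words) == n_words:
            if chars >= min_chars:
                solutions.append(" ".join(words))
            return
        for nxt in sorted(successors.get(words[-1], ())):
            extend(words + [nxt], chars + 1 + len(nxt))  # +1 for the space

    extend([first_word], len(first_word))
    return solutions


def make_perplexity(corpus):
    """Return a scorer based on an add-one-smoothed bigram model.
    Assumption: this is a cheap stand-in for LLM perplexity."""
    pair_counts, left_counts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        words = sentence.split()
        vocab.update(words)
        for left, right in zip(words, words[1:]):
            pair_counts[(left, right)] += 1
            left_counts[left] += 1

    def perplexity(sentence):
        words = sentence.split()
        log_prob = 0.0
        for left, right in zip(words, words[1:]):
            p = (pair_counts[(left, right)] + 1) / (left_counts[left] + len(vocab))
            log_prob += math.log(p)
        return math.exp(-log_prob / max(len(words) - 1, 1))

    return perplexity


successors = build_bigrams(CORPUS)
candidates = generate(successors, "the", n_words=6, min_chars=20, max_chars=22)
score = make_perplexity(CORPUS)
ranked = sorted(candidates, key=score)  # lowest perplexity first
```

The hard constraints are enforced during the search, so every candidate is valid by construction. The scorer is only used afterwards to order valid candidates, which mirrors the separation between solving and curation that the abstract describes.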