🤖 AI Summary
To address the poor syntactic controllability of large language models (LLMs), this paper proposes a fine-grained, tuning-free method for syntactically constrained generation. It frames the task as posterior inference solved with sequential Monte Carlo (SMC) sampling, uses a syntactic tagger as a real-time guidance module during decoding, and designs a syntax-aware proposal distribution tailored to constituent-tree constraints. Evaluated with GPT2-large and Llama3-8B, the method reaches a syntactic F1 score of about 93, improving on the respective baselines by 80.7 and 57.7 points, while preserving text fluency and semantic coherence. The core contribution is embedding rigorous constituent-tree constraints directly into the autoregressive generation process, enabling high-precision syntactic control without architectural modification or parameter updates.
📝 Abstract
Controlling the syntactic structure of text generated by language models is valuable for applications requiring clarity, stylistic consistency, or interpretability, yet it remains a challenging task. In this paper, we argue that sampling algorithms based on posterior inference can effectively enforce a target constituency structure during generation. Our approach combines sequential Monte Carlo, which estimates the posterior distribution by sampling from a proposal distribution, with a syntactic tagger that ensures each generated token aligns with the desired syntactic structure. Our experiments with GPT2-large and Llama3-8B show that with an appropriate proposal distribution, we can improve syntactic accuracy, increasing the F1 score from $12.31$ (GPT2-large) and $35.33$ (Llama3-8B) to about $93$ in both cases without compromising the language model's fluency. These results underscore both the complexity of syntactic control and the effectiveness of sampling algorithms, offering a promising approach for applications where precise control over syntax is essential.
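The abstract describes a propose–weight–resample loop: each particle is extended by a proposal distribution, weighted by how well it matches the target structure, and particles are resampled in proportion to their weights. The toy sketch below illustrates that generic SMC loop with a hypothetical alternation constraint standing in for a real constituent-tree check; the function names (`smc_generate`, `step`, `weight`) and the character-level "grammar" are illustrative assumptions, not the paper's implementation.

```python
import random

random.seed(0)

def smc_generate(step_fn, weight_fn, n_particles=16, n_steps=6):
    """Generic sequential Monte Carlo: extend each particle via the
    proposal, reweight it, then resample in proportion to weight."""
    particles = [[] for _ in range(n_particles)]
    for _ in range(n_steps):
        weights = []
        for p in particles:
            p.append(step_fn(p))          # sample next token from the proposal
            weights.append(weight_fn(p))  # posterior weight (constraint check)
        total = sum(weights)
        if total == 0:
            raise RuntimeError("all particles violated the constraint")
        probs = [w / total for w in weights]
        # multinomial resampling: high-weight particles survive and multiply
        particles = [list(random.choices(particles, probs)[0])
                     for _ in range(n_particles)]
    return particles

# Hypothetical "syntax": strict a/b alternation stands in for a parse tree.
def step(prefix):
    # Syntax-aware proposal: only offer tokens that keep the prefix valid,
    # mirroring the paper's idea of constraining the proposal distribution.
    allowed = [c for c in "ab" if not prefix or c != prefix[-1]]
    return random.choice(allowed)

def weight(prefix):
    # Near-zero (rather than zero) weight avoids degenerate particle sets.
    valid = all(x != y for x, y in zip(prefix, prefix[1:]))
    return 1.0 if valid else 1e-9

samples = smc_generate(step, weight)
```

Because the proposal here never emits an invalid token, every final particle satisfies the constraint; with an unconstrained proposal, the resampling step alone would have to filter out violations, which is exactly the gap a syntax-aware proposal closes.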