🤖 AI Summary
In constrained text generation, a severe mismatch between the target distribution and the prior of a pretrained autoregressive language model impedes learning under sparse rewards. To address this, we propose a probabilistic inference framework based on twisted sequential Monte Carlo (twisted SMC). Our core innovations are (i) a learnable twist function that adaptively reshapes the proposal distribution, and (ii) a self-distillation mechanism that uses self-critical sampling to generate high-quality pseudo-labels for iteratively refining the model toward the target distribution. Crucially, our approach reframes constrained generation as implicit probabilistic inference, requiring neither reinforcement learning nor external supervision. Extensive experiments across diverse constrained generation tasks, including keyword-controlled and attribute-controlled generation, demonstrate substantial improvements over state-of-the-art baselines. The results confirm that our method mitigates the prior–target distribution mismatch, improves sampling efficiency, and raises generation quality.
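The twisted-SMC idea in the summary can be sketched on a toy problem. In this minimal sketch, the uniform base model, the hand-written twist `psi`, and the "sequence must contain `c`" constraint are all illustrative assumptions standing in for the paper's learned components; only the mechanics (twist-induced proposal, incremental importance weights, resampling) follow the general twisted-SMC recipe.

```python
import math
import random

random.seed(0)

# Toy twisted-SMC sampler over a three-token vocabulary. The uniform base
# model, the hand-written twist psi, and the "must contain c" constraint
# are illustrative assumptions, not the paper's actual components.
VOCAB = ["a", "b", "c"]

def base_prob(token, prefix):
    """Uniform base model p(token | prefix)."""
    return 1.0 / len(VOCAB)

def psi(seq):
    """Hypothetical twist: reward sequences satisfying the constraint
    (here, containing the token 'c')."""
    return 5.0 if "c" in seq else 1.0

def twisted_smc_step(particles, weights):
    """One step: sample each particle's next token from the twist-induced
    proposal q(token | prefix) proportional to p(token | prefix) * psi(prefix + token),
    update importance weights, then resample."""
    new_particles, new_weights = [], []
    for prefix, w in zip(particles, weights):
        scores = [base_prob(t, prefix) * psi(prefix + t) for t in VOCAB]
        Z = sum(scores)
        token = random.choices(VOCAB, [s / Z for s in scores])[0]
        # Incremental weight under the twist-induced proposal: Z / psi(prefix)
        new_particles.append(prefix + token)
        new_weights.append(w * Z / psi(prefix))
    total = sum(new_weights)
    # Multinomial resampling to counter weight degeneracy
    resampled = random.choices(new_particles,
                               [w / total for w in new_weights],
                               k=len(particles))
    return resampled, [1.0] * len(particles)

particles, weights = [""] * 64, [1.0] * 64
for _ in range(4):
    particles, weights = twisted_smc_step(particles, weights)
```

Because the twist upweights continuations that satisfy the constraint, most surviving particles contain `c` after a few steps, even though the base model alone would produce such sequences far less often.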
📝 Abstract
Recent work has framed constrained text generation with autoregressive language models as a probabilistic inference problem. Among these, Zhao et al. (2024) introduced a promising approach based on twisted sequential Monte Carlo, which incorporates learned twist functions and twist-induced proposals to guide generation. However, in constrained settings where the target distribution concentrates on outputs that are unlikely under the base model, learning becomes challenging due to sparse and uninformative reward signals. We show that iteratively refining the base model through self-distillation alleviates this issue by making the model progressively more aligned with the target, leading to substantial gains in generation quality.
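The self-distillation loop described above can be illustrated with a minimal sketch. Here the "model" is a single categorical token distribution, the reward is a toy constraint check, and the self-critical step keeps samples that beat the batch-mean reward before refitting the model on them by maximum likelihood; all of these are simplifying assumptions, not the paper's implementation.

```python
import random
from collections import Counter

random.seed(0)

# Toy self-distillation loop. The categorical "model", the constraint
# reward, and the batch-mean baseline are illustrative assumptions.
VOCAB = ["a", "b", "c"]
LENGTH = 4

def reward(seq):
    """Toy constraint reward: 1 if the sequence contains 'c'."""
    return 1.0 if "c" in seq else 0.0

def sample_seq(probs):
    return "".join(random.choices(VOCAB, probs, k=LENGTH))

def self_distill_round(probs, n_samples=256):
    """One round: sample candidates from the current model, keep those
    beating the batch-mean reward (a self-critical baseline), and refit
    the token distribution by maximum likelihood on the kept pseudo-labels."""
    samples = [sample_seq(probs) for _ in range(n_samples)]
    rewards = [reward(s) for s in samples]
    baseline = sum(rewards) / len(rewards)
    pseudo = [s for s, r in zip(samples, rewards) if r > baseline]
    if not pseudo:  # every sample already satisfies the constraint
        return probs
    counts = Counter("".join(pseudo))
    total = sum(counts.values())
    return [counts[t] / total for t in VOCAB]

probs = [1 / 3] * 3  # start from a uniform "base model"
for _ in range(3):
    probs = self_distill_round(probs)
```

Each round shifts probability mass toward the constraint-satisfying token, so the refined model produces high-reward samples more often; this mirrors, in miniature, how self-distillation makes the base model progressively better aligned with the target and thus densifies the reward signal for later rounds.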