🤖 AI Summary
In constrained text generation, a severe mismatch between the target distribution and the prior of a pretrained autoregressive language model impedes learning under sparse rewards. To address this, we propose a probabilistic inference framework based on twisted sequential Monte Carlo (twisted SMC). Our core innovations are (i) a learnable twist function that adaptively reshapes the proposal distribution, and (ii) a self-distillation mechanism that uses self-critical sampling to generate high-quality pseudo-labels for iteratively refining the model toward the target distribution. Crucially, our approach reframes constrained generation as implicit probabilistic inference, requiring neither reinforcement learning nor external supervision. Extensive experiments across diverse constrained generation tasks, including keyword-controlled and attribute-controlled generation, demonstrate substantial improvements over state-of-the-art baselines. The results confirm that our method mitigates the prior–target distribution mismatch, improves sampling efficiency, and raises generation quality.
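The twisted-SMC idea in the summary can be sketched on a toy problem. In this minimal sketch, the uniform base model, the hand-written twist `psi`, and the "sequence must contain `c`" constraint are all illustrative assumptions standing in for the paper's learned components; only the mechanics (twist-induced proposal, incremental importance weights, resampling) follow the general twisted-SMC recipe.

```python
import math
import random

random.seed(0)

# Toy twisted-SMC sampler over a three-token vocabulary. The uniform base
# model, the hand-written twist psi, and the "must contain c" constraint
# are illustrative assumptions, not the paper's actual components.
VOCAB = ["a", "b", "c"]

def base_prob(token, prefix):
    """Uniform base model p(token | prefix)."""
    return 1.0 / len(VOCAB)

def psi(seq):
    """Hypothetical twist: reward sequences satisfying the constraint
    (here, containing the token 'c')."""
    return 5.0 if "c" in seq else 1.0

def twisted_smc_step(particles, weights):
    """One step: sample each particle's next token from the twist-induced
    proposal q(token | prefix) proportional to p(token | prefix) * psi(prefix + token),
    update importance weights, then resample."""
    new_particles, new_weights = [], []
    for prefix, w in zip(particles, weights):
        scores = [base_prob(t, prefix) * psi(prefix + t) for t in VOCAB]
        Z = sum(scores)
        token = random.choices(VOCAB, [s / Z for s in scores])[0]
        # Incremental weight under the twist-induced proposal: Z / psi(prefix)
        new_particles.append(prefix + token)
        new_weights.append(w * Z / psi(prefix))
    total = sum(new_weights)
    # Multinomial resampling to counter weight degeneracy
    resampled = random.choices(new_particles,
                               [w / total for w in new_weights],
                               k=len(particles))
    return resampled, [1.0] * len(particles)

particles, weights = [""] * 64, [1.0] * 64
for _ in range(4):
    particles, weights = twisted_smc_step(particles, weights)
```

Because the twist upweights continuations that satisfy the constraint, most surviving particles contain `c` after a few steps, even though the base model alone would produce such sequences far less often.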
📝 Abstract
Recent work has framed constrained text generation with autoregressive language models as a probabilistic inference problem. Among these, Zhao et al. (2024) introduced a promising approach based on twisted sequential Monte Carlo, which incorporates learned twist functions and twist-induced proposals to guide generation. However, in constrained settings where the target distribution concentrates on outputs that are unlikely under the base model, learning becomes challenging due to sparse and uninformative reward signals. We show that iteratively refining the base model through self-distillation alleviates this issue by making the model progressively more aligned with the target, leading to substantial gains in generation quality.
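The self-distillation loop described above can be illustrated with a minimal sketch. Here the "model" is a single categorical token distribution, the reward is a toy constraint check, and the self-critical step keeps samples that beat the batch-mean reward before refitting the model on them by maximum likelihood; all of these are simplifying assumptions, not the paper's implementation.

```python
import random
from collections import Counter

random.seed(0)

# Toy self-distillation loop. The categorical "model", the constraint
# reward, and the batch-mean baseline are illustrative assumptions.
VOCAB = ["a", "b", "c"]
LENGTH = 4

def reward(seq):
    """Toy constraint reward: 1 if the sequence contains 'c'."""
    return 1.0 if "c" in seq else 0.0

def sample_seq(probs):
    return "".join(random.choices(VOCAB, probs, k=LENGTH))

def self_distill_round(probs, n_samples=256):
    """One round: sample candidates from the current model, keep those
    beating the batch-mean reward (a self-critical baseline), and refit
    the token distribution by maximum likelihood on the kept pseudo-labels."""
    samples = [sample_seq(probs) for _ in range(n_samples)]
    rewards = [reward(s) for s in samples]
    baseline = sum(rewards) / len(rewards)
    pseudo = [s for s, r in zip(samples, rewards) if r > baseline]
    if not pseudo:  # every sample already satisfies the constraint
        return probs
    counts = Counter("".join(pseudo))
    total = sum(counts.values())
    return [counts[t] / total for t in VOCAB]

probs = [1 / 3] * 3  # start from a uniform "base model"
for _ in range(3):
    probs = self_distill_round(probs)
```

Each round shifts probability mass toward the constraint-satisfying token, so the refined model produces high-reward samples more often; this mirrors, in miniature, how self-distillation makes the base model progressively better aligned with the target and thus densifies the reward signal for later rounds.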