Controlled LLM Decoding via Discrete Auto-regressive Biasing

📅 2025-02-06

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address the challenge of satisfying constraints while preserving linguistic fluency in controlled text generation with large language models (LLMs), this paper proposes a gradient-guided controlled decoding framework that operates directly in the discrete token space. The method introduces an auxiliary bias sequence and formalizes a joint distribution over the generated sequence and the bias sequence. It samples from that distribution with a Langevin-within-Gibbs discrete MCMC sampler, which avoids the distortion introduced by continuous-space approximations, and it uses gradients of the energy function to inject constraints at the token level, without model fine-tuning or reparameterization. Experiments on sentiment control, detoxification, and keyword-guided generation show substantially better constraint adherence with comparable or better fluency than the baselines (measured by perplexity and human evaluation), at lower computational cost.

📝 Abstract

Controlled text generation allows for enforcing user-defined constraints on large language model outputs, an increasingly important field as LLMs become more prevalent in everyday life. One common approach uses energy-based decoding, which defines a target distribution through an energy function that combines multiple constraints into a weighted average. However, these methods often struggle to balance fluency with constraint satisfaction, even with extensive tuning of the energy function's coefficients. In this paper, we identify that this suboptimal balance arises from sampling in continuous space rather than the natural discrete space of text tokens. To address this, we propose Discrete Auto-regressive Biasing, a controlled decoding algorithm that leverages gradients while operating entirely in the discrete text domain. Specifically, we introduce a new formulation for controlled text generation by defining a joint distribution over the generated sequence and an auxiliary bias sequence. To efficiently sample from this joint distribution, we propose a Langevin-within-Gibbs sampling algorithm using gradient-based discrete MCMC. Our method significantly improves constraint satisfaction while maintaining comparable or better fluency, all with even lower computational costs. We demonstrate the advantages of our controlled decoding method on sentiment control, language detoxification, and keyword-guided generation.
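
The energy-based formulation mentioned in the abstract can be illustrated on a toy scale. The sketch below is not taken from the paper: the candidate sentences, their per-constraint energies, and the weights are all made-up values, and a plain weighted sum stands in for the weighted combination the abstract describes. It only shows how the choice of coefficients shifts probability mass between a fluent sentence and a degenerate one that satisfies the constraint.

```python
import math

def weighted_energy(energies, weights):
    """Combine per-constraint energies into one scalar (lower is better)."""
    return sum(w * e for w, e in zip(weights, energies))

def target_distribution(candidates, weights):
    """Normalize exp(-energy) over a finite candidate set."""
    totals = [weighted_energy(c["energies"], weights) for c in candidates]
    lowest = min(totals)  # subtract the minimum for numerical stability
    unnorm = [math.exp(-(t - lowest)) for t in totals]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Hypothetical candidates; energies are (fluency energy, sentiment energy).
candidates = [
    {"text": "the movie was fine", "energies": (1.0, 3.0)},   # fluent, misses constraint
    {"text": "great great great", "energies": (4.0, 0.5)},    # satisfies constraint, not fluent
    {"text": "the movie was great", "energies": (1.5, 1.0)},  # reasonable on both
]

probs_balanced = target_distribution(candidates, (1.0, 1.0))
probs_constraint_heavy = target_distribution(candidates, (0.1, 1.0))

best_balanced = candidates[probs_balanced.index(max(probs_balanced))]["text"]
best_heavy = candidates[probs_constraint_heavy.index(max(probs_constraint_heavy))]["text"]
print(best_balanced)
print(best_heavy)
```

With equal weights the sentence that does well on both terms is the most probable; when the fluency weight is reduced to 0.1, the repetitive candidate overtakes it. This is the kind of coefficient sensitivity the abstract refers to when it says tuning alone often fails to balance the two goals.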

Problem

Research questions and friction points this paper is trying to address.

Balancing fluency against constraint satisfaction

Sampling in the discrete text space rather than a continuous approximation

Reducing computational cost

Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete Auto-regressive Biasing

Langevin-within-Gibbs sampling

Gradient-based discrete MCMC
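
The three ideas listed above can be pictured with a deliberately small sketch. It is not the paper's implementation. The six-word vocabulary, the `toy_lm_logits` stand-in for a language model, the linear sentiment constraint, the additive bias bonus, and the omission of any Metropolis-Hastings correction are all assumptions made for illustration. The sketch alternates two steps: sample the text autoregressively given a bias sequence, then resample the bias sequence with a gradient-informed discrete proposal centered on the current text.

```python
import math
import random

VOCAB = ["the", "movie", "was", "bad", "good", "great"]
# Hypothetical per-token constraint score (higher = more positive sentiment).
SENTIMENT = [0.0, 0.0, 0.0, -2.0, 1.5, 2.0]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def toy_lm_logits(prefix):
    """Stand-in for a language model: discourages repeating the last token."""
    logits = [0.0] * len(VOCAB)
    if prefix:
        logits[prefix[-1]] = -1.0
    return logits

def sample_y_given_b(bias_seq, bias_weight, rng):
    """Gibbs step 1: autoregressive sampling with a bonus on each bias token."""
    y = []
    for b in bias_seq:
        logits = toy_lm_logits(y)
        logits[b] += bias_weight  # assumed form of biasing, chosen for simplicity
        y.append(rng.choices(range(len(VOCAB)), weights=softmax(logits))[0])
    return y

def langevin_proposal(current_token, grad, step_size):
    """Discrete Langevin-style proposal for one position."""
    logits = []
    for v in range(len(VOCAB)):
        if v == current_token:
            logits.append(0.0)
        else:
            # First-order gain from switching, minus a locality penalty.
            logits.append(0.5 * (grad[v] - grad[current_token]) - 1.0 / step_size)
    return softmax(logits)

def sample_b_given_y(y, step_size, rng):
    """Gibbs step 2: move each bias token away from y_i along the gradient."""
    grad = SENTIMENT  # gradient of the linear constraint w.r.t. a one-hot token
    return [
        rng.choices(range(len(VOCAB)),
                    weights=langevin_proposal(t, grad, step_size))[0]
        for t in y
    ]

def run(steps=5, length=4, seed=0):
    rng = random.Random(seed)
    bias_seq = [rng.randrange(len(VOCAB)) for _ in range(length)]
    y = []
    for _ in range(steps):
        y = sample_y_given_b(bias_seq, bias_weight=3.0, rng=rng)
        bias_seq = sample_b_given_y(y, step_size=2.0, rng=rng)
    return y, bias_seq

y, bias_seq = run()
print(" ".join(VOCAB[t] for t in y))
```

Everything here stays in token space: the gradient is used only to reweight a categorical proposal over vocabulary entries, so no continuous relaxation of the sequence is ever sampled. A real system would replace `toy_lm_logits` with an actual model and `SENTIMENT` with the gradient of a learned constraint function.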
