DINGO: Constrained Inference for Diffusion LLMs

📅 2025-05-29
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
Diffusion large language models (Diffusion LLMs) struggle to strictly satisfy user-specified syntactic constraints—e.g., regular expressions—resulting in unreliable structured outputs such as JSON. To address this, we propose the first dynamic programming–driven, distribution-preserving constrained decoding method for Diffusion LLMs. Our approach compiles regular expressions into finite automata and performs parallel token-block–level constraint pruning during the diffusion denoising process. Crucially, it guarantees both strict adherence to the constraints (100% constraint satisfaction) and exact preservation of the original model’s output distribution—without fine-tuning, reparameterization, or architectural modification—and remains compatible with arbitrary Diffusion LLMs. Evaluated on symbolic mathematics and JSON generation benchmarks, our method improves constraint satisfaction rates by up to 68 percentage points over unconstrained baselines, while rigorously maintaining the pre-trained model’s generative distribution.
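The regex-to-automaton pruning step described above can be sketched as follows. This is an illustrative toy, not DINGO's implementation: the hand-built DFA (for the regex `[0-9]+`), the token vocabulary, and the function names are all assumptions for the example.

```python
# Toy DFA for the regex [0-9]+ ; a real system would compile this table
# from the user's regular expression.
DFA = {
    ("start", "digit"): "digits",
    ("digits", "digit"): "digits",
}

def step(state, ch):
    """Advance the DFA by one character; None means a dead state."""
    return DFA.get((state, "digit" if ch.isdigit() else "other"))

def allowed_tokens(state, vocab):
    """Keep only tokens whose characters leave the automaton alive."""
    allowed = []
    for tok in vocab:
        s = state
        for ch in tok:
            s = step(s, ch)
            if s is None:
                break
        else:
            allowed.append(tok)
    return allowed

vocab = ["12", "3", "ab", "4x", "007"]
print(allowed_tokens("start", vocab))  # ['12', '3', '007']
```

Pruning at the token level (rather than character level) is what lets the mask be applied directly to the model's vocabulary distribution at each position of a denoising block.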

📝 Abstract
Diffusion LLMs have emerged as a promising alternative to conventional autoregressive LLMs, offering significant potential for improved runtime efficiency. However, existing diffusion models lack the ability to provably enforce user-specified formal constraints, such as regular expressions, which makes them unreliable for tasks that require structured outputs, such as fixed-schema JSON generation. Unlike autoregressive models that generate tokens sequentially, diffusion LLMs predict a block of tokens in parallel. This parallelism makes traditional constrained decoding algorithms, which are designed for sequential token prediction, ineffective at preserving the true output distribution. To address this limitation, we propose DINGO, a dynamic programming-based constrained decoding strategy that is both efficient and provably distribution-preserving. DINGO enables sampling of output strings with the highest probability under the model's predicted distribution, while strictly satisfying any user-specified regular expression. On standard symbolic math and JSON generation benchmarks, DINGO achieves up to a 68 percentage point improvement over unconstrained inference.
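The dynamic program the abstract alludes to can be sketched as a Viterbi-style search over (position, automaton state) pairs. This is a minimal illustration of the idea, not DINGO's actual algorithm: the toy DFA (for `[0-9]+`), the per-position distributions, and all names are assumptions for the example.

```python
import math

# Toy DFA for the regex [0-9]+ .
DFA = {("start", "digit"): "digits", ("digits", "digit"): "digits"}

def trans(state, tok):
    """Run the automaton over a token's characters; None = dead state."""
    for ch in tok:
        state = DFA.get((state, "digit" if ch.isdigit() else "other"))
        if state is None:
            return None
    return state

def best_constrained(probs, start, transition, accepting):
    """DP over (position, DFA state): keep the best-scoring prefix
    reaching each state, then read off the best accepted sequence."""
    dp = {start: (0.0, [])}          # state -> (log-prob, token sequence)
    for dist in probs:               # one token distribution per position
        nxt = {}
        for state, (lp, seq) in dp.items():
            for tok, p in dist.items():
                s2 = transition(state, tok)
                if s2 is None:
                    continue         # token would violate the regex
                cand = (lp + math.log(p), seq + [tok])
                if s2 not in nxt or cand[0] > nxt[s2][0]:
                    nxt[s2] = cand
        dp = nxt
    survivors = [v for s, v in dp.items() if s in accepting]
    return max(survivors, key=lambda v: v[0])[1] if survivors else None

# The unconstrained argmax "a" + "b" violates [0-9]+; the DP instead
# recovers the highest-probability string that satisfies it.
probs = [{"a": 0.6, "7": 0.4}, {"b": 0.7, "9": 0.3}]
print(best_constrained(probs, "start", trans, {"digits"}))  # ['7', '9']
```

Because every position's distribution is available at once in a diffusion denoising step, this whole DP runs over the block in one pass, which is what makes the approach a natural fit for parallel token prediction.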
Problem

Research questions and friction points this paper is trying to address.

Enforcing formal constraints in diffusion LLMs
Preserving output distribution during constrained decoding
Enabling structured outputs like fixed-schema JSON
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic programming-based constrained decoding strategy
Efficient and provably distribution-preserving sampling
Strictly satisfies user-specified regular expressions