🤖 AI Summary
Existing discrete diffusion language models lack efficient inference-time control mechanisms, making it difficult to flexibly steer high-level semantic attributes of generated text. This work proposes ILRR, a training-free inference-time guidance framework that enables precise semantic control by dynamically aligning the internal activations of the denoising sequence with those of a single reference sequence. The core innovations are iterative latent representation refinement and a Spatially Modulated Steering strategy, which together allow guidance strength to be adjusted flexibly at the cost of only one additional forward pass per denoising step. Experiments on LLaDA and MDLM demonstrate that ILRR improves attribute-control accuracy by 10 to 60 percentage points with minimal computational overhead while preserving generation quality.
📝 Abstract
Discrete Diffusion Language Models (DLMs) offer a promising non-autoregressive alternative for text generation, yet effective mechanisms for inference-time control remain relatively underexplored. Existing approaches rely on sampling-level guidance procedures or trajectory-optimization mechanisms. In this work, we introduce Iterative Latent Representation Refinement (ILRR), a learning-free framework for steering DLMs using a single reference sequence. ILRR guides generation by dynamically aligning the internal activations of the generated sequence with those of a given reference throughout the denoising process. This approach captures and transfers high-level semantic properties, with a tunable steering scale enabling flexible control over attributes such as sentiment. We further introduce Spatially Modulated Steering, an extension that enables steering long texts with shorter references by regulating guidance intensity across the sequence. Empirically, we demonstrate that ILRR achieves effective attribute steering on LLaDA and MDLM architectures with minor computational overhead, requiring only one additional parallel forward pass per denoising step. Under the same compute budget, ILRR improves attribute accuracy over comparable baselines by 10 to 60 percentage points, while maintaining high generation quality.
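The core mechanism described above can be sketched as follows. This is a minimal conceptual illustration, not the authors' implementation: the function and parameter names (`ilrr_step`, `spatially_modulated_scales`, the linear taper used for positions beyond the reference length) are assumptions chosen for clarity, and only the two ideas from the abstract are shown: pulling generated-sequence activations toward the reference's activations with a tunable steering scale, and weakening that pull per position when the reference is shorter than the generated text.

```python
import numpy as np

def ilrr_step(hidden_gen, hidden_ref, scale):
    """One ILRR-style refinement step (conceptual sketch): nudge the
    generated sequence's internal activations toward those of the
    reference sequence.

    hidden_gen, hidden_ref: (seq_len, dim) activation matrices.
    scale: steering strength; 0.0 leaves activations unchanged,
           1.0 replaces them with the reference activations.
    """
    return hidden_gen + scale * (hidden_ref - hidden_gen)

def spatially_modulated_scales(seq_len, ref_len, base_scale):
    """Sketch of Spatially Modulated Steering: when the reference is
    shorter than the generated text, keep full guidance over the first
    ref_len positions and taper it to zero beyond them.

    The linear taper is an illustrative assumption; the paper only
    states that guidance intensity is regulated across the sequence.
    Returns a (seq_len,) vector of per-position steering scales.
    """
    scales = np.full(seq_len, base_scale)
    if seq_len > ref_len:
        scales[ref_len:] = np.linspace(base_scale, 0.0, seq_len - ref_len)
    return scales
```

In a full pipeline, `ilrr_step` would be applied to intermediate activations at each denoising step, using activations obtained from one extra parallel forward pass over the reference, which matches the overhead figure quoted in the abstract.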