DIP: Dynamic In-Context Planner For Diffusion Language Models

📅 2026-01-06
🏛️ arXiv.org
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Diffusion language models suffer from high computational costs in long-context scenarios due to their bidirectional attention mechanism. This work proposes DIP, a dynamic context planner that, for the first time, enables dynamic context optimization during the generation process of diffusion language models. By employing a context planning strategy at inference time, DIP selectively inserts exemplars on demand, thereby overcoming the limitations of conventional static prompting. Experimental results demonstrate that DIP achieves up to a 12.9× speedup over standard inference while maintaining generation quality, and is 1.17× faster than KV cache-enhanced inference.

๐Ÿ“ Abstract
Diffusion language models (DLMs) have shown strong potential for general natural language tasks with in-context examples. However, due to the bidirectional attention mechanism, DLMs incur substantial computational cost as context length increases. This work addresses this issue with a key discovery: unlike the sequential generation in autoregressive language models (ARLMs), the diffusion generation paradigm in DLMs allows \textit{efficient dynamic adjustment of the context} during generation. Building on this insight, we propose \textbf{D}ynamic \textbf{I}n-Context \textbf{P}lanner (DIP), a context-optimization method that dynamically selects and inserts in-context examples during generation, rather than providing all examples in the prompt upfront. Results show DIP maintains generation quality while achieving up to 12.9$\times$ inference speedup over standard inference and 1.17$\times$ over KV cache-enhanced inference.
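The core idea described in the abstract, starting from a small context and inserting exemplars between denoising steps instead of prepending them all up front, can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the relevance score, the insertion trigger, and the stand-in "denoising" step are all assumptions made for the sketch.

```python
def select_exemplar(pool, context, k=1):
    # Toy relevance score: word overlap with the current context.
    # (The paper's actual selection criterion is not specified here.)
    def score(ex):
        return len(set(ex.split()) & set(" ".join(context).split()))
    return sorted(pool, key=score, reverse=True)[:k]

def dip_generate(prompt, exemplar_pool, steps=4, budget=2):
    """Toy sketch of dynamic in-context planning: the context starts
    small (cheaper bidirectional attention) and exemplars are inserted
    on demand during generation rather than all upfront."""
    context = [prompt]
    inserted = 0
    draft = ["[MASK]"] * 6  # stand-in for the partially denoised sequence
    for step in range(steps):
        # Hypothetical trigger: insert an exemplar while masked positions
        # remain and the exemplar budget is not exhausted.
        if inserted < budget and "[MASK]" in draft:
            ex = select_exemplar(exemplar_pool, context, k=1)
            context = ex + context  # exemplar joins the context mid-generation
            inserted += 1
        # Stand-in denoising step: unmask one position per step.
        if "[MASK]" in draft:
            draft[draft.index("[MASK]")] = f"tok{step}"
    return context, draft
```

The point of the sketch is the ordering: because diffusion decoding revisits the whole sequence at every step, the context can grow mid-generation, which is what distinguishes DIP from static prompting in autoregressive models.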
Problem

Research questions and friction points this paper is trying to address.

diffusion language models
computational cost
context length
in-context learning
bidirectional attention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Diffusion Language Models
Dynamic In-Context Planning
Efficient Inference
Context Optimization
Bidirectional Attention