🤖 AI Summary
Diffusion language models (DLMs) face a combinatorial search challenge in parallel denoising inference—jointly optimizing mask position selection and token commitment—where existing heuristic or auxiliary-training-based approaches struggle to balance efficiency and generation quality. This paper proposes MEDAL, the first framework to integrate Monte Carlo Tree Search (MCTS) into the DLM initialization phase. MEDAL introduces a confidence-guided action space pruning mechanism and a residual-mask-bit prioritization strategy, enabling training-free, principled, search-based inference. Evaluated across multiple benchmarks, MEDAL achieves up to a 22.0% improvement over state-of-the-art methods, significantly enhancing both local token accuracy and global sequence coherence. By unifying search-based reasoning with diffusion-based language modeling, MEDAL establishes a novel paradigm for DLM inference that is theoretically grounded, empirically effective, and computationally efficient.
📝 Abstract
Diffusion language models (DLMs) have recently emerged as a compelling alternative to autoregressive generation, offering parallel generation and improved global coherence. During inference, DLMs generate text by iteratively denoising masked sequences in parallel; however, determining which positions to unmask and which tokens to commit forms a large combinatorial search problem. Existing inference methods approximate this search using heuristics, which often yield suboptimal decoding paths; other approaches instead rely on additional training to guide token selection. To introduce a principled search mechanism for DLMs inference, we introduce MEDAL, a framework that integrates Monte Carlo Tree SEarch initialization for Diffusion LAnguage Model inference. We employ Monte Carlo Tree Search at the initialization stage to explore promising unmasking trajectories, providing a robust starting point for subsequent refinement. This integration is enabled by restricting the search space to high-confidence actions and prioritizing token choices that improve model confidence over remaining masked positions. Across multiple benchmarks, MEDAL achieves up to 22.0% improvement over existing inference strategies, establishing a new paradigm for search-based inference in diffusion language models.