Dream-Coder 7B: An Open Diffusion Language Model for Code

πŸ“… 2025-09-01
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
Traditional autoregressive (AR) code models are constrained by a fixed left-to-right decoding order, limiting their applicability to sketch-first programming, interactive code completion, and interleaved reasoning. This work introduces Dream-Coder 7B, an open-source discrete diffusion language model that enables arbitrary-order code generation. Methodologically, it demonstrates task-adaptive non-monotonic decoding for code synthesis and adapts a pretrained AR checkpoint to discrete diffusion via a continuous-time weighted cross-entropy objective. Post-training combines supervised fine-tuning, which mitigates padding pathologies through random truncation and a padding penalty, with a reinforcement learning recipe tailored to diffusion language models that uses verifiable reward signals. The resulting instruct model achieves 21.4% pass@1 on LiveCodeBench (2410-2505) and competitive performance on HumanEval, MBPP, BigCodeBench, and CRUXEval. The model checkpoints, training recipes, preprocessing pipelines, and inference code are fully open-sourced.

πŸ“ Abstract
We present Dream-Coder 7B, an open-source discrete diffusion language model for code generation that exhibits emergent any-order generation capabilities. Unlike traditional autoregressive (AR) models that decode strictly left-to-right, Dream-Coder 7B adaptively determines its decoding strategy based on the coding task: sketch-first generation for complex algorithms, left-to-right generation for straightforward completions, and interleaved reasoning generation for code understanding tasks. We adapt a pretrained AR checkpoint to a discrete diffusion framework with a continuous-time weighted cross-entropy objective. Our post-training recipe comprises (i) supervised fine-tuning, where we mitigate padding pathologies via random truncation and a padding penalty to improve sample efficiency and stabilize generation; and (ii) reinforcement learning with verifiable rewards over a curated high-quality prompt set drawn from open-source datasets, using a tailored reinforcement learning recipe for diffusion language models. The resulting Dream-Coder 7B Instruct attains 21.4% pass@1 on LiveCodeBench (2410--2505) and demonstrates competitive performance on HumanEval, MBPP, BigCodeBench, and CRUXEval. We release Dream-Coder-7B and Dream-Coder-7B-Instruct checkpoints, training recipes, preprocessing pipelines, and inference code to facilitate reproducibility and further research.
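The abstract's continuous-time weighted cross-entropy objective can be illustrated with a toy masked-diffusion loss: sample a diffusion time t uniformly, mask each token independently with probability t, and weight the cross-entropy on masked positions by 1/t. This is a minimal sketch under those assumptions; the function names, the `1/t` weighting, and the dictionary-based "model" are illustrative stand-ins, not the paper's exact formulation.

```python
import math
import random

MASK = "<mask>"

def corrupt(tokens, t, rng):
    """Mask each token independently with probability t (the diffusion time)."""
    return [MASK if rng.random() < t else tok for tok in tokens]

def token_loss(probs, target):
    """Cross-entropy for one position given model probabilities (a dict here)."""
    return -math.log(probs.get(target, 1e-9))

def diffusion_loss(tokens, model_probs, rng):
    """Sample t ~ U(0,1), corrupt the sequence, and weight masked-token CE by 1/t."""
    t = max(rng.random(), 1e-3)  # clamp away from zero to avoid division by zero
    noisy = corrupt(tokens, t, rng)
    losses = [
        token_loss(model_probs[i], tok)
        for i, (tok, noisy_tok) in enumerate(zip(tokens, noisy))
        if noisy_tok == MASK  # only masked positions contribute to the loss
    ]
    if not losses:
        return 0.0
    return (1.0 / t) * sum(losses) / len(losses)
```

A model that assigns probability 1 to every target token incurs zero loss regardless of which positions were masked, while an uncertain model incurs a positive, time-weighted loss.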
Problem

Research questions and friction points this paper is trying to address.

Fixed left-to-right AR decoding limits sketch-first programming, interactive completion, and interleaved reasoning
A single decoding strategy cannot adapt to varying coding-task complexity
Padding pathologies during diffusion fine-tuning hurt sample efficiency and generation stability
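The any-order generation the paper describes can be sketched as confidence-based unmasking: instead of filling positions left to right, each step fills the masked slot where the model is most confident. The `predict` callback below is a toy stand-in for a real denoising network, and the selection rule is an assumption for illustration, not Dream-Coder's actual sampler.

```python
HOLE = None  # sentinel for a still-masked position

def decode_any_order(length, predict):
    """Fill a fully masked sequence one slot at a time, always choosing the
    masked position with the highest model confidence.

    predict(seq, i) -> (token, confidence) for masked position i."""
    seq = [HOLE] * length
    order = []
    while HOLE in seq:
        masked = [i for i, tok in enumerate(seq) if tok is HOLE]
        # pick the masked slot where the (stand-in) model is most confident
        best = max(masked, key=lambda i: predict(seq, i)[1])
        seq[best] = predict(seq, best)[0]
        order.append(best)
    return seq, order
```

With a toy predictor that is more confident toward the end of the sequence, decoding proceeds right to left, illustrating how the visit order emerges from the model rather than being fixed in advance.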
Innovation

Methods, ideas, or system contributions that make the work stand out.

Discrete diffusion framework for code generation
Adaptive decoding strategy based on task type
Combined supervised fine-tuning with reinforcement learning
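The SFT tricks named in the abstract, random truncation and a padding penalty, can be sketched as a preprocessing step: truncate each example at a random point so the model does not overfit to one fixed padded length, then assign pad-token targets a small loss weight instead of full weight. The function name, the 0.1 default weight, and the exact truncation rule are assumptions for illustration only.

```python
import random

PAD = "<pad>"

def prepare_example(tokens, max_len, rng, pad_weight=0.1):
    """Randomly truncate, pad to max_len, and return per-position loss weights
    that down-weight pad targets (the 'padding penalty')."""
    # random truncation: keep a random prefix of at least one token
    cut = rng.randint(1, len(tokens))
    kept = tokens[:cut]
    padded = kept + [PAD] * (max_len - len(kept))
    # full weight on real tokens, reduced weight on pad targets
    weights = [1.0 if tok != PAD else pad_weight for tok in padded]
    return padded[:max_len], weights[:max_len]
```

Down-weighting rather than fully masking pad positions keeps some training signal on sequence termination while preventing the padding pathologies the abstract mentions from dominating the loss.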
πŸ”Ž Similar Papers
No similar papers found.