Diffusion Large Language Models for Black-Box Optimization

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the challenge of offline black-box optimization under extreme data scarcity, where existing methods struggle to efficiently generate high-performance designs. We propose the first approach that leverages a diffusion-based large language model for this task, integrating natural language prompts to fuse task descriptions with offline data and introducing a context-aware denoising module to produce high-quality candidate solutions. The core innovation lies in a novel masked diffusion tree search strategy that dynamically balances exploration and exploitation, enabling iterative refinement guided by expected improvement under a Gaussian process surrogate. Evaluated on the Design-Bench benchmark under few-shot settings, our method significantly outperforms prior state-of-the-art approaches, achieving the best reported performance to date.
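As an illustration of the prompt-based fusion described above, here is a hypothetical sketch of how a task description and a few offline (design, score) pairs might be serialized into a single natural-language denoising prompt. The template, mask token, and function name are assumptions for illustration, not the paper's actual format.

```python
# Hypothetical prompt layout for the in-context denoising step: the task
# description and offline (design, score) pairs are serialized as text,
# followed by a partially masked design for the model to denoise.
# The exact template is an assumption; the paper's format may differ.
def build_denoising_prompt(task, pairs, masked_design, mask_token="[MASK]"):
    lines = [f"Task: {task}", "Offline examples:"]
    lines += [f"  design={d}  score={y}" for d, y in pairs]
    lines.append(f"Denoise into a higher-scoring design: {masked_design}")
    return "\n".join(lines).replace("_", mask_token)

prompt = build_denoising_prompt(
    "maximize GC content of a DNA sequence",
    [("ACGTAC", 2), ("GGCGTA", 4)],
    "G__C__",
)
print(prompt)
```

The key point is that the offline dataset and task description are fused into one context window, so the diffusion LLM can condition on both when filling in the masked positions.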

📝 Abstract
Offline black-box optimization (BBO) aims to find optimal designs based solely on an offline dataset of designs and their labels. Such scenarios frequently arise in domains like DNA sequence design and robotics, where only a few labeled data points are available. Traditional methods typically rely on task-specific proxy or generative models, overlooking the in-context learning capabilities of pre-trained large language models (LLMs). Recent efforts have adapted autoregressive LLMs to BBO by framing task descriptions and offline datasets as natural language prompts, enabling direct design generation. However, these designs often contain bidirectional dependencies, which left-to-right models struggle to capture. In this paper, we explore diffusion LLMs for BBO, leveraging their bidirectional modeling and iterative refinement capabilities. This motivates our in-context denoising module: we condition the diffusion LLM on the task description and the offline dataset, both formatted in natural language, and prompt it to denoise masked designs into improved candidates. To guide generation toward high-performing designs, we introduce masked diffusion tree search, which casts the denoising process as a step-wise Monte Carlo Tree Search that dynamically balances exploration and exploitation. Each node represents a partially masked design, each denoising step is an action, and candidates are evaluated via expected improvement under a Gaussian process trained on the offline dataset. Our method, dLLM, achieves state-of-the-art results in few-shot settings on Design-Bench.
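The search procedure described in the abstract can be sketched as follows. This is a toy, self-contained approximation: a greedy beam over unmasking actions stands in for the paper's Monte Carlo Tree Search, and a nearest-neighbour mean/std stands in for the Gaussian process surrogate; only the expected-improvement criterion is implemented as stated. All names and the DNA toy objective are illustrative assumptions.

```python
# Toy sketch of masked-diffusion tree search (hypothetical names throughout).
# Designs are short token sequences; MASK marks undecided positions.
# Each search "action" unmasks one position; designs are scored by
# expected improvement (EI) under a stand-in surrogate (a nearest-neighbour
# mean/std in place of the paper's Gaussian process).
import math
import random

ALPHABET = "ACGT"
MASK = "_"

def toy_score(design):  # hidden black-box objective, used only to label data
    return sum(c == "G" for c in design)

# Offline dataset: a few (design, label) pairs, as in the few-shot setting.
random.seed(0)
offline = ["".join(random.choice(ALPHABET) for _ in range(6)) for _ in range(8)]
offline = [(d, toy_score(d)) for d in offline]
best_y = max(y for _, y in offline)

def surrogate(design, k=3):
    """Mean/std over the k offline neighbours closest in Hamming distance."""
    nearest = sorted(offline, key=lambda p: sum(a != b for a, b in zip(design, p[0])))
    ys = [y for _, y in nearest[:k]]
    mu = sum(ys) / k
    var = sum((y - mu) ** 2 for y in ys) / k
    return mu, math.sqrt(var) + 1e-6

def expected_improvement(design):
    """Standard EI: (mu - best) * Phi(z) + sigma * phi(z), z = (mu - best) / sigma."""
    mu, sigma = surrogate(design)
    z = (mu - best_y) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mu - best_y) * cdf + sigma * pdf

def tree_search(length=6, beam=4):
    """Step-wise search: each level unmasks one position, keeping the
    `beam` partial designs whose cheap completions score highest EI."""
    frontier = [MASK * length]
    for pos in range(length):
        children = [d[:pos] + t + d[pos + 1:] for d in frontier for t in ALPHABET]
        # Rank children by EI of a cheap completion (remaining masks -> 'A').
        children.sort(key=lambda d: expected_improvement(d.replace(MASK, "A")),
                      reverse=True)
        frontier = children[:beam]
    return max(frontier, key=expected_improvement)

candidate = tree_search()
print(candidate, expected_improvement(candidate))
```

In the paper's actual method the denoising actions come from the diffusion LLM rather than brute-force enumeration, and node selection follows MCTS statistics rather than a fixed beam; the sketch only shows how partially masked designs, unmasking steps, and EI-based evaluation fit together.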
Problem

Research questions and friction points this paper is trying to address.

black-box optimization
offline dataset
bidirectional dependencies
few-shot design
diffusion models
Innovation

Methods, ideas, or system contributions that make the work stand out.

diffusion LLM
black-box optimization
in-context denoising
masked diffusion tree search
bidirectional modeling
🔎 Similar Papers
2024-06-27 · International Conference on Learning Representations · Citations: 18
Ye Yuan
McGill University, Mila - Quebec AI Institute
Generative Modeling, Black Box Optimization, Knowledge-Centric NLP, LLMs
Can Chen
MILA - Quebec AI Institute
Zipeng Sun
McGill, MILA - Quebec AI Institute
Dinghuai Zhang
MILA - Quebec AI Institute, Microsoft Research
C. Pal
MILA - Quebec AI Institute, Polytechnique Montreal, Canada CIFAR AI Chair
Xue Liu
McGill, MILA - Quebec AI Institute