Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion

📅 2026-05-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Efficient sampling from reward-tilted distributions in discrete diffusion models remains hindered by the costly Monte Carlo approximations required to estimate the optimal twist function. This work proposes the Contrastive Distribution Matching (CDM) framework, which introduces contrastive learning into sequential Monte Carlo (SMC) inference for discrete sequences for the first time. CDM learns a parameterized twist function from positive and negative samples, which amortizes the cost of inference, and it reformulates the gradient estimator to exploit the closed-form forward kernels of discrete diffusion. Evaluating the learned twist adds less than 5% computational overhead relative to a single forward pass of the base model while preserving sample quality, and CDM outperforms existing baselines under matched wall-clock time. The method is evaluated across diverse applications, including detoxified text generation, regulatory DNA design, protein designability optimization, and alignment of diffusion-based large language models.

📝 Abstract

Discrete diffusion models have emerged as powerful frameworks for generating structured categorical data. However, efficiently sampling from reward-tilted distributions remains a fundamental challenge. While Twisted Sequential Monte Carlo (SMC) offers asymptotic exactness for this task, estimating the optimal twist function in discrete state spaces necessitates costly Monte Carlo approximations, resulting in a severe computational bottleneck at inference. To overcome this limitation, we introduce Contrastive Distribution Matching (CDM), a novel framework that amortizes the cost of SMC inference by learning a parameterized twist function via positive and negative samples. For efficient training, we reformulate the gradient estimator to leverage the closed-form forward kernels of discrete diffusion models. In practice, evaluating our learned twist function incurs less than 5% additional computational overhead compared to a single forward pass of the base model. Through extensive empirical evaluations, we demonstrate that CDM consistently outperforms existing baselines under matched wall-clock time. We validate the effectiveness and versatility of our approach across a diverse range of applications, including toxic text generation, regulatory DNA sequence design, protein designability, and diffusion large language model alignment.
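
For readers unfamiliar with the terms in the abstract, the block below sketches the standard textbook form of a reward-tilted target and its optimal twist. It is not taken from the paper. The symbols $p_\theta$ (base model), $r$ (reward), $\beta$ (temperature), $\psi_t^*$ (optimal twist), and $\pi_t$ (intermediate target) are assumed notation, and the paper's own definitions may differ.

```latex
% Assumed notation, not the paper's: p_\theta is the base diffusion model,
% r is the reward, \beta > 0 is a temperature.
\begin{align}
  % Reward-tilted target over clean sequences x_0
  \pi^*(x_0) &\propto p_\theta(x_0)\,\exp\!\big(r(x_0)/\beta\big), \\
  % Optimal twist at noise level t: expected tilt of the final sample given x_t
  \psi_t^*(x_t) &= \mathbb{E}_{x_0 \sim p_\theta(x_0 \mid x_t)}
      \Big[\exp\!\big(r(x_0)/\beta\big)\Big], \\
  % Intermediate SMC target: marginal of the tilted joint at step t
  \pi_t(x_t) &\propto p_\theta(x_t)\,\psi_t^*(x_t).
\end{align}
```

Under this reading, the conditional expectation in the second line is the quantity that needs Monte Carlo estimation at every step, since each estimate requires drawing $x_0$ from the base model given $x_t$. A learned twist would replace that estimate with a single network evaluation.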

Problem

Research questions and friction points this paper is trying to address.

discrete diffusion

reward-tilted distributions

Sequential Monte Carlo

twist function

inference efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contrastive Distribution Matching

Amortized Inference

Discrete Diffusion Models