🤖 AI Summary
This work studies methods for sampling uniformly random solutions of random k-SAT and k-XORSAT formulas, which have recently been used as synthetic constraint satisfaction problem (CSP) benchmarks for discrete generative models. The authors systematically compare continuous diffusions with masked discrete diffusions and examine the impact of variable ordering on generation quality. Experimental results show that continuous diffusions outperform masked discrete diffusions, and that learned diffusions can match the theoretically ideal accuracy. Moreover, smart variable orderings substantially improve accuracy, although the best orderings do not follow popular heuristics. These findings demonstrate that insights from the theory of random CSPs have observable, sometimes counterintuitive, implications for the behavior of generative models, and offer a new perspective on discrete diffusion modeling.
📝 Abstract
Generating data from discrete distributions is important for a number of application domains including text, tabular data, and genomic data. Several groups have recently used random $k$-satisfiability ($k$-SAT) as a synthetic benchmark for new generative techniques. In this paper, we show that fundamental insights from the theory of random constraint satisfaction problems have observable implications (sometimes contradicting intuition) on the behavior of generative techniques on such benchmarks. More precisely, we study the problem of generating a uniformly random solution of a given (random) $k$-SAT or $k$-XORSAT formula. Among other findings, we observe that: $(i)$~Continuous diffusions outperform masked discrete diffusions; $(ii)$~Learned diffusions can match the theoretical `ideal' accuracy; $(iii)$~Smart ordering of the variables can significantly improve accuracy, although not following popular heuristics.
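The sampling task the paper benchmarks against can be made concrete with a brute-force baseline: draw a random $k$-SAT formula, enumerate all satisfying assignments, and pick one uniformly at random. This is a minimal sketch for tiny instances only (the function names and the clause representation are our own illustration, not the paper's method, which uses learned diffusion models):

```python
import itertools
import random

def random_ksat(n_vars, n_clauses, k, rng):
    """Random k-SAT formula: each clause picks k distinct variables,
    each literal negated independently with probability 1/2.
    A clause is a list of (variable index, negated?) pairs."""
    formula = []
    for _ in range(n_clauses):
        chosen = rng.sample(range(n_vars), k)
        formula.append([(v, rng.random() < 0.5) for v in chosen])
    return formula

def satisfies(assignment, formula):
    """A clause is satisfied if some literal is true:
    assignment[v] must differ from the negation flag."""
    return all(any(assignment[v] != neg for v, neg in clause)
               for clause in formula)

def uniform_solution(formula, n_vars, rng):
    """Enumerate all 2^n assignments and sample a satisfying one
    uniformly; returns None if the formula is unsatisfiable.
    Exponential in n_vars -- baseline for tiny instances only."""
    solutions = [a for a in itertools.product((False, True), repeat=n_vars)
                 if satisfies(a, formula)]
    return rng.choice(solutions) if solutions else None
```

A generative model for this task must approximate exactly this uniform distribution over solutions, but from learned samples rather than exhaustive enumeration, which is what makes random $k$-SAT a useful synthetic benchmark.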