Glauber Generative Model: Discrete Diffusion Models via Binary Classification

📅 2024-05-27
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
This work addresses generative modeling for discrete data, such as text and discretized images, by proposing a discrete diffusion model grounded in Glauber (heat bath) dynamics. Methodologically, it formulates denoising as a token-wise binary classification task (distinguishing signal tokens from noise tokens), which yields an exact reduction of learning the denoising Markov chain, with no variational approximations or importance sampling; the approach is compatible with general-purpose tokenizers (e.g., VQGAN) and requires no dataset-specific architectural design. The key contributions are: (i) the first formulation of discrete diffusion as exact, token-level discriminative learning; (ii) state-of-the-art language generation among discrete diffusion models; and (iii) zero-shot text and image infilling, with strong image generation achieved without dataset-specific image tokenizers.

๐Ÿ“ Abstract
We introduce the Glauber Generative Model (GGM), a new class of discrete diffusion models, to obtain new samples from a distribution given samples from a discrete space. GGM deploys a discrete Markov chain called the heat bath dynamics (or the Glauber dynamics) to denoise a sequence of noisy tokens to a sample from a joint distribution of discrete tokens. Our novel conceptual framework provides an exact reduction of the task of learning the denoising Markov chain to solving a class of binary classification tasks. More specifically, the model learns to classify a given token in a noisy sequence as signal or noise. In contrast, prior works on discrete diffusion models either solve regression problems to learn importance ratios, or minimize loss functions given by variational approximations. We apply GGM to language modeling and image generation, where images are discretized using image tokenizers like VQGANs. We show that it outperforms existing discrete diffusion models in language generation, and demonstrates strong performance for image generation without using dataset-specific image tokenizers. We also show that our model is capable of performing well in zero-shot control settings like text and image infilling.
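The heat-bath (Glauber) update described in the abstract can be sketched as a toy single-site resampling step. This is a minimal illustration, not the authors' implementation: `classify_signal` is a hypothetical stand-in for the paper's learned binary classifier, here assumed to return the probability that a candidate token at a position is signal given the surrounding context.

```python
import random

def glauber_denoise_step(tokens, position, classify_signal, vocab):
    """One heat-bath (Glauber) update: resample the token at `position`
    from a conditional distribution over `vocab`, holding all other
    tokens fixed. `classify_signal(tokens, position, candidate)` is a
    hypothetical learned binary classifier returning the probability
    that `candidate` is signal (not noise) in this context.
    """
    context = list(tokens)  # copy so the input sequence is not mutated
    # Score each candidate token by the classifier's signal probability.
    weights = [classify_signal(context, position, v) for v in vocab]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Heat-bath move: sample the new token from the conditional.
    r, acc = random.random(), 0.0
    for v, p in zip(vocab, probs):
        acc += p
        if r < acc:
            context[position] = v
            break
    return context
```

Iterating this update over positions of a noisy sequence sketches how the denoising Markov chain converges toward a sample from the joint token distribution; the exact schedule and classifier parameterization follow the paper, not this toy.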
Problem

Research questions and friction points this paper is trying to address.

Sampling new examples from a distribution over discrete spaces (text, tokenized images) given only training samples
Prior discrete diffusion models rely on regression of importance ratios or variational approximations rather than exact training objectives
Strong image generation typically depends on dataset-specific image tokenizers or architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deploys Glauber (heat bath) dynamics as the denoising Markov chain over discrete tokens
Exactly reduces learning the denoising chain to token-wise binary classification of signal vs. noise
Applies to both language modeling and image generation using general-purpose tokenizers (e.g., VQGAN)