Glauber Generative Model: Discrete Diffusion Models via Binary Classification

📅 2024-05-27
🏛️ arXiv.org
📈 Citations: 2
✨ Influential: 0
🤖 AI Summary
This work addresses generative modeling for discrete data, such as text and discretized images, by proposing a discrete diffusion model grounded in Glauber (heat bath) dynamics. Methodologically, it formulates denoising as a token-wise binary classification task (distinguishing signal tokens from noise tokens), which yields an exact reduction of learning the denoising Markov chain, with no variational approximations or importance sampling; the approach is compatible with general-purpose tokenizers (e.g., VQGAN) and requires no dataset-specific architectural design. The key contributions are: (i) the first formulation of discrete diffusion as exact, token-level discriminative learning; (ii) state-of-the-art language generation among discrete diffusion models; and (iii) zero-shot text and image infilling, with strong image generation achieved without dataset-specific image tokenizers.

๐Ÿ“ Abstract
We introduce the Glauber Generative Model (GGM), a new class of discrete diffusion models, to obtain new samples from a distribution given samples from a discrete space. GGM deploys a discrete Markov chain called the heat bath dynamics (or the Glauber dynamics) to denoise a sequence of noisy tokens to a sample from a joint distribution of discrete tokens. Our novel conceptual framework provides an exact reduction of the task of learning the denoising Markov chain to solving a class of binary classification tasks. More specifically, the model learns to classify a given token in a noisy sequence as signal or noise. In contrast, prior works on discrete diffusion models either solve regression problems to learn importance ratios, or minimize loss functions given by variational approximations. We apply GGM to language modeling and image generation, where images are discretized using image tokenizers like VQGANs. We show that it outperforms existing discrete diffusion models in language generation, and demonstrates strong performance for image generation without using dataset-specific image tokenizers. We also show that our model is capable of performing well in zero-shot control settings like text and image infilling.
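The heat-bath (Glauber) update described in the abstract can be sketched as a toy single-site resampling step. This is a minimal illustration, not the authors' implementation: `classify_signal` is a hypothetical stand-in for the paper's learned binary classifier, here assumed to return the probability that a candidate token at a position is signal given the surrounding context.

```python
import random

def glauber_denoise_step(tokens, position, classify_signal, vocab):
    """One heat-bath (Glauber) update: resample the token at `position`
    from a conditional distribution over `vocab`, holding all other
    tokens fixed. `classify_signal(tokens, position, candidate)` is a
    hypothetical learned binary classifier returning the probability
    that `candidate` is signal (not noise) in this context.
    """
    context = list(tokens)  # copy so the input sequence is not mutated
    # Score each candidate token by the classifier's signal probability.
    weights = [classify_signal(context, position, v) for v in vocab]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Heat-bath move: sample the new token from the conditional.
    r, acc = random.random(), 0.0
    for v, p in zip(vocab, probs):
        acc += p
        if r < acc:
            context[position] = v
            break
    return context
```

Iterating this update over positions of a noisy sequence sketches how the denoising Markov chain converges toward a sample from the joint token distribution; the exact schedule and classifier parameterization follow the paper, not this toy.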
Problem

Research questions and friction points this paper is trying to address.

Sampling new examples from a distribution over discrete spaces (text, tokenized images) given only training samples
Prior discrete diffusion models rely on regression of importance ratios or variational approximations rather than exact training objectives
Strong image generation typically depends on dataset-specific image tokenizers or architectures
Innovation

Methods, ideas, or system contributions that make the work stand out.

Deploys Glauber (heat bath) dynamics as the denoising Markov chain over discrete tokens
Exactly reduces learning the denoising chain to token-wise binary classification of signal vs. noise
Applies to both language modeling and image generation using general-purpose tokenizers (e.g., VQGAN)