🤖 AI Summary
Discrete non-negative ordinal data lack an analogue of Tweedie's formula, which in continuous spaces links the denoiser learned in training to the score used in sampling. This makes it hard to achieve denoising, sampling, and exact likelihood estimation with a single model. This work proposes Binomial Flows, a framework that connects denoising and flow-based generative modeling for discrete ordinal data. The result is a unified training and inference recipe in which one model denoises, computes exact likelihoods, and generates samples. Experiments verify the methodology on synthetic data and show competitive generative performance on real-world datasets.
📝 Abstract
Flow-based generative modeling in continuous spaces exploits Tweedie's formula to express the denoiser (learned in training) as a score function (used in sampling). In contrast, this relation has been largely missing in the discrete setting, where common approaches focus on learning discrete scores and rates. In this work, we close this gap for discrete non-negative ordinal data by introducing Binomial flows. Our framework provides a simple recipe for training a discrete diffusion model that simultaneously denoises, samples, and estimates exact likelihoods. We verify our methodology on synthetic examples and obtain competitive results on real-world datasets.
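For readers unfamiliar with the continuous-space relation the abstract refers to, the following minimal sketch checks Tweedie's formula numerically in the one case where everything is available in closed form: a Gaussian prior with additive Gaussian noise. It illustrates only the continuous formula, not the Binomial flows method, whose details are not given here. The function names and the parameters `mu`, `s`, and `sigma` are illustrative assumptions.

```python
# Tweedie's formula for additive Gaussian noise y = x + sigma * eps:
#     E[x | y] = y + sigma**2 * d/dy log p(y)
# Assumed toy setting: prior x ~ N(mu, s**2), so the marginal of y is
# N(mu, s**2 + sigma**2) and both sides have closed forms.

def gaussian_marginal_score(mu, s, sigma):
    """Return the score d/dy log p(y) of the noisy marginal N(mu, s^2 + sigma^2)."""
    var = s**2 + sigma**2
    return lambda y: -(y - mu) / var

def tweedie_denoise(y, sigma, score):
    """Denoiser obtained from the score via Tweedie's formula."""
    return y + sigma**2 * score(y)

def gaussian_posterior_mean(y, mu, s, sigma):
    """Exact posterior mean E[x | y] for the Gaussian prior, for comparison."""
    return (s**2 * y + sigma**2 * mu) / (s**2 + sigma**2)

if __name__ == "__main__":
    mu, s, sigma, y = 0.0, 1.0, 0.5, 1.5
    score = gaussian_marginal_score(mu, s, sigma)
    print(tweedie_denoise(y, sigma, score))
    print(gaussian_posterior_mean(y, mu, s, sigma))
```

The two printed values agree up to floating-point rounding, which is the denoiser-to-score identity that the abstract says has been largely missing for discrete data.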