🤖 AI Summary
Existing denoising diffusion and flow matching methods struggle to jointly satisfy differentiability, efficiency, and distribution fidelity when modeling tabular data with mixed continuous and discrete features.
Method: The paper develops TabbyFlow, a variational flow matching method for tabular data built on Exponential Family Variational Flow Matching (EF-VFM). EF-VFM represents continuous and discrete features with a single general exponential family distribution, so mixed-type features are generated jointly. This yields an efficient, data-driven training objective based on moment matching. The paper also connects variational flow matching to generalized flow matching objectives based on Bregman divergences.
Results: On multiple standard tabular benchmarks, EF-VFM achieves state-of-the-art performance, outperforming existing diffusion and flow matching baselines in generation quality, training efficiency, and fidelity of both marginal and joint distributions.
📝 Abstract
While denoising diffusion and flow matching have driven major advances in generative modeling, their application to tabular data remains limited, despite the ubiquity of such data in real-world applications. To address this gap, we develop TabbyFlow, a Variational Flow Matching (VFM) method for tabular data generation. To apply VFM to data with mixed continuous and discrete features, we introduce Exponential Family Variational Flow Matching (EF-VFM), which represents heterogeneous data types using a general exponential family distribution. This yields an efficient, data-driven objective based on moment matching, enabling principled learning of probability paths over mixed continuous and discrete variables. We also establish a connection between variational flow matching and generalized flow matching objectives based on Bregman divergences. Evaluation on tabular data benchmarks demonstrates state-of-the-art performance compared to baselines.
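
To illustrate the moment-matching idea in the abstract, here is a minimal sketch and not the paper's implementation. It assumes one table row with continuous features modeled as unit-variance Gaussians and one discrete feature modeled as a categorical distribution with logits as natural parameters. The function names `nll_mixed` and `moment_gap` are hypothetical. For any exponential family, the gradient of the negative log-likelihood with respect to the natural parameters equals the model's expected sufficient statistics minus the observed ones, so the loss is minimized when the moments match.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nll_mixed(mu, logits, x_cont, k):
    """Negative log-likelihood of one mixed-type row (additive constants dropped).

    Assumption: continuous features are unit-variance Gaussians with mean `mu`,
    and the discrete feature is categorical with natural parameters `logits`.
    """
    gauss = 0.5 * sum((x - m) ** 2 for x, m in zip(x_cont, mu))
    cat = -math.log(softmax(logits)[k])
    return gauss + cat

def moment_gap(mu, logits, x_cont, k):
    """Gradient of nll_mixed w.r.t. (mu, logits).

    Each entry is (model moment) - (observed sufficient statistic):
    mean minus value for the Gaussian part, class probability minus
    one-hot indicator for the categorical part.
    """
    probs = softmax(logits)
    grad_mu = [m - x for m, x in zip(mu, x_cont)]
    grad_logits = [p - (1.0 if j == k else 0.0) for j, p in enumerate(probs)]
    return grad_mu, grad_logits
```

In a flow matching setting, `mu` and `logits` would presumably be produced by a network conditioned on the current point and time, with the data endpoint playing the role of the observed row. That wiring is an assumption here and is omitted from the sketch.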