Energy-based generator matching: A neural sampler for general state space

📅 2025-05-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses generative modeling in the data-free setting: given only an energy function, it enables modeling of arbitrary continuous-time Markov processes—including diffusion, flow, and jump processes—and unifies generation of continuous, discrete, and hybrid-modal data. Methodologically, we generalize the generator-matching framework to generic continuous-time Markov processes for the first time, introducing self-normalized importance sampling and guided resampling to substantially reduce estimator variance. The target distribution is implicitly defined by the energy function, requiring neither observed data nor explicit normalization constants. Experiments demonstrate effectiveness on discrete tasks up to 100 dimensions and hybrid-modal tasks up to 20 dimensions. To our knowledge, this establishes the first general-purpose, data-free, energy-based generative framework supporting multimodal data generation.

📝 Abstract
We propose Energy-based generator matching (EGM), a modality-agnostic approach to train generative models from energy functions in the absence of data. Extending the recently proposed generator matching, EGM enables training of arbitrary continuous-time Markov processes, e.g., diffusion, flow, and jump processes, and can generate continuous data, discrete data, and mixtures of the two modalities. To this end, we propose estimating the generator matching loss using self-normalized importance sampling, with an additional bootstrapping trick to reduce variance in the importance weights. We validate EGM on discrete and multimodal tasks of up to 100 and 20 dimensions, respectively.
Problem

Research questions and friction points this paper is trying to address.

Trains generative models from energy functions alone, without any observed data
Enables training of arbitrary continuous-time Markov processes
Generates data across continuous, discrete, and mixed modalities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modality-agnostic generative model training
Self-normalized importance sampling technique
Handles continuous, discrete, and mixed-modality data
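
The key estimation tool named above, self-normalized importance sampling, computes expectations under a target that is only known up to its normalizer, i.e., p(x) ∝ exp(−E(x)) for an energy E. A minimal sketch of that idea (not the paper's actual estimator; the Gaussian proposal and toy energy here are illustrative assumptions):

```python
import numpy as np

def snis_expectation(energy, f, proposal_sample, proposal_logpdf,
                     n=100_000, seed=0):
    """Self-normalized importance sampling estimate of E_p[f(x)]
    for p(x) ∝ exp(-energy(x)), without knowing the normalizer."""
    rng = np.random.default_rng(seed)
    x = proposal_sample(rng, n)
    # Log importance weights: log p̃(x) - log q(x).
    log_w = -energy(x) - proposal_logpdf(x)
    log_w -= log_w.max()  # stabilize before exponentiating
    w = np.exp(log_w)
    # Self-normalization cancels the unknown partition function.
    return np.sum(w * f(x)) / np.sum(w)

# Toy check: the target p ∝ exp(-x²/2) is a standard normal, so E[x²] = 1.
sigma = 2.0  # broader-than-target Gaussian proposal (illustrative choice)
est = snis_expectation(
    energy=lambda x: 0.5 * x**2,
    f=lambda x: x**2,
    proposal_sample=lambda rng, n: rng.normal(0.0, sigma, n),
    proposal_logpdf=lambda x: -0.5 * (x / sigma) ** 2
                              - np.log(sigma * np.sqrt(2 * np.pi)),
)
```

The estimate converges to the true expectation as the sample count grows, but its variance depends heavily on the proposal, which is the motivation for the bootstrapping/guided-resampling tricks the paper introduces.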
Dongyeop Woo
KAIST
Machine learning
Minsu Kim
Korea Advanced Institute of Science and Technology (KAIST), Mila - Quebec AI Institute
Minkyu Kim
Korea Advanced Institute of Science and Technology (KAIST)
Kiyoung Seong
M.Sc. student, KAIST
AI for Science
Sungsoo Ahn
KAIST
Machine Learning