🤖 AI Summary
This work addresses the challenges of training energy-based generative models (EBMs): maximum likelihood estimation is computationally expensive, and score-matching approaches suffer from mode blindness. To overcome these limitations, we propose the Diffusive Classification (DiffCLF) objective, which reformulates EBM training as a supervised classification task across noise levels. This approach captures multimodal distributions while remaining computationally tractable, and it integrates seamlessly with standard score-matching objectives, improving both sample fidelity and mode coverage. Empirical results demonstrate that our method accurately recovers the underlying energy function in Gaussian mixture models and outperforms existing techniques on tasks such as model composition and Boltzmann Generator sampling.
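The reformulation at the heart of DiffCLF, treating noise levels as class labels, can be illustrated with a toy sketch. The paper's exact objective is not reproduced here; the code below is a hypothetical stand-in that classifies which noise level a sample came from via a softmax over per-level log-densities (known in closed form for Gaussians). The names `SIGMAS`, `neg_energy_logit`, and `cross_entropy` are illustrative, not the paper's API.

```python
import math
import random

# Assumed toy setup: data at noise level t is N(0, sigma_t^2), so the
# "energy" E_t(x) = x^2 / (2 sigma_t^2) and its log-partition are known.
SIGMAS = [0.1, 0.5, 1.0, 2.0]  # hypothetical noise schedule

def neg_energy_logit(x, sigma):
    """Logit for one noise level: log N(x; 0, sigma^2) = -E_t(x) - log Z_t."""
    return -x * x / (2 * sigma ** 2) - math.log(sigma * math.sqrt(2 * math.pi))

def noise_level_posterior(x):
    """Softmax over noise levels (uniform prior over levels assumed)."""
    logits = [neg_energy_logit(x, s) for s in SIGMAS]
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def cross_entropy(x, true_level):
    """Supervised classification loss: which noise level produced x?"""
    return -math.log(noise_level_posterior(x)[true_level])

if __name__ == "__main__":
    random.seed(0)
    # Average loss when each sample is drawn from its own noise level.
    losses = []
    for k, s in enumerate(SIGMAS):
        for _ in range(200):
            x = random.gauss(0.0, s)
            losses.append(cross_entropy(x, k))
    print(sum(losses) / len(losses))
```

In the actual method the per-level energies would be a learned network rather than known Gaussians, and minimizing this cross-entropy is what supplies the mode-coverage signal that plain score matching lacks.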
📝 Abstract
Score-based generative models have recently achieved remarkable success. While they are usually parameterized directly by the score, an alternative is to use a series of time-dependent energy-based models (EBMs), in which the score is obtained as the negative input-gradient of the energy. Crucially, EBMs can be leveraged not only for generation, but also for tasks such as compositional sampling or building Boltzmann Generators via Monte Carlo methods. However, training EBMs remains challenging: direct maximum likelihood is computationally prohibitive due to the need for nested sampling, while score matching, though efficient, suffers from mode blindness. To address these issues, we introduce the Diffusive Classification (DiffCLF) objective, a simple method that avoids blindness while remaining computationally efficient. DiffCLF reframes EBM learning as a supervised classification problem across noise levels, and can be seamlessly combined with standard score-based objectives. We validate the effectiveness of DiffCLF by comparing the estimated energies against ground truth in analytically tractable Gaussian mixture cases, and by applying the trained models to tasks such as model composition and Boltzmann Generator sampling. Our results show that DiffCLF enables EBMs with higher fidelity and broader applicability than existing approaches.
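The parameterization described above, where the score is the negative input-gradient of a time-dependent energy, can be sketched with a toy quadratic energy. This is an assumption for illustration only (the paper's energies are learned networks); a finite-difference check confirms the score equals the negative gradient.

```python
# Toy time-dependent EBM: the score s(x, t) is -dE(x, t)/dx.
# The quadratic energy here is a stand-in, not the paper's model.

def energy(x, t):
    """Hypothetical energy E(x, t) = x^2 / (2 (1 + t)): flatter at higher t."""
    return x * x / (2.0 * (1.0 + t))

def score(x, t):
    """Score as negative input-gradient, known analytically for this toy E."""
    return -x / (1.0 + t)

def score_fd(x, t, h=1e-5):
    """Central finite-difference approximation of -dE/dx, as a sanity check."""
    return -(energy(x + h, t) - energy(x - h, t)) / (2.0 * h)

if __name__ == "__main__":
    x, t = 1.5, 0.3
    print(score(x, t), score_fd(x, t))
```

With a neural energy one would obtain the same quantity by automatic differentiation with respect to the input; having an explicit energy, rather than only a score network, is what makes the compositional and Monte Carlo applications above possible.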