Efficient Training of Boltzmann Generators Using Off-Policy Log-Dispersion Regularization

📅 2026-02-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of efficiently sampling from unnormalized probability densities, such as Boltzmann distributions, in settings where both simulation data and target energy evaluations are costly. The authors propose off-policy log-dispersion regularization (LDR), a regularization framework built on a generalization of the log-variance objective. LDR uses the target energy labels of existing data to regularize the shape of the energy landscape, without requiring additional on-policy samples. It is combined with standard data-based training objectives and supports unbiased or biased simulation datasets as well as purely variational training without access to target samples. Across all benchmarks, LDR improves both final performance and data efficiency, with sample efficiency gains of up to one order of magnitude.

📝 Abstract

Sampling from unnormalized probability densities is a central challenge in computational science. Boltzmann generators are generative models that enable independent sampling from the Boltzmann distribution of physical systems at a given temperature. However, their practical success depends on data-efficient training, as both simulation data and target energy evaluations are costly. To this end, we propose off-policy log-dispersion regularization (LDR), a novel regularization framework that builds on a generalization of the log-variance objective. We apply LDR in the off-policy setting in combination with standard data-based training objectives, without requiring additional on-policy samples. LDR acts as a shape regularizer of the energy landscape by leveraging additional information in the form of target energy labels. The proposed regularization framework is broadly applicable, supporting unbiased or biased simulation datasets as well as purely variational training without access to target samples. Across all benchmarks, LDR improves both final performance and data efficiency, with sample efficiency gains of up to one order of magnitude.
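
To make the idea of an off-policy log-variance term concrete, here is a minimal sketch of the standard log-variance objective that the abstract says LDR generalizes. It is not the paper's implementation: the function name `log_variance_loss`, the inverse temperature `beta`, and the toy harmonic energy are assumptions chosen for illustration, and the paper's actual dispersion measure and weighting are not given in this summary. The term is the variance of log q(x) + beta * E(x) over a fixed batch. It is zero exactly when the model log-density matches the target up to an additive constant on that batch, so the samples may come from any distribution, including an existing dataset.

```python
import numpy as np

def log_variance_loss(log_q, energy, beta=1.0):
    """Variance of log q(x) - log p~(x) over a batch, with log p~(x) = -beta * E(x).

    Hypothetical helper for illustration; the batch may be drawn from any
    reference distribution (off-policy), not necessarily from the model q.
    """
    log_ratio = np.asarray(log_q) + beta * np.asarray(energy)
    return float(np.var(log_ratio))

# Toy setup (assumed): 1D harmonic energy E(x) = x^2 / 2 at beta = 1,
# whose Boltzmann distribution is a standard normal.
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=1000)  # off-policy samples: neither model nor target
energy = 0.5 * x**2

# Model that matches the target: standard normal log-density.
log_q_exact = -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
# Mismatched model: normal with standard deviation 2.
log_q_wrong = -0.5 * (x / 2.0) ** 2 - np.log(2.0) - 0.5 * np.log(2 * np.pi)

loss_exact = log_variance_loss(log_q_exact, energy)
loss_wrong = log_variance_loss(log_q_wrong, energy)
print(loss_exact, loss_wrong)
```

In this sketch the matched model yields a loss that is zero up to floating-point error, while the mismatched model yields a clearly positive value. Because the variance ignores additive constants, the unknown normalization constant of the target never has to be computed.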

Problem

Research questions and friction points this paper is trying to address.

Boltzmann Generators

sampling

unnormalized probability densities

data-efficient training

energy landscape

Innovation

Methods, ideas, or system contributions that make the work stand out.

Boltzmann Generators

off-policy training

log-dispersion regularization