🤖 AI Summary
This work addresses conditional modeling of high-dimensional probability distributions by proposing a unified generative framework that jointly learns the conditional distributions on the level sets of a collective variable ξ: ℝᵈ → ℝᵏ (1 ≤ k < d). To counter the inaccurate modeling caused by sparse samples on low-probability level sets, it introduces a data enrichment strategy that draws on data from enhanced sampling techniques, so that conditional distributions on multiple level sets can be learned together. The method combines generative modeling, collective variable analysis, and conditional density estimation into a single approach that learns conditional distributions on multiple level sets simultaneously. Numerical examples demonstrate its effectiveness, including on level sets in low-probability regions. The framework is potentially useful for generative modeling in molecular simulation and related areas of computational science.
📝 Abstract
Given a probability distribution $\mu$ in $\mathbb{R}^d$ represented by data, we study the generative modeling of its conditional probability distributions on the level sets of a collective variable $\xi: \mathbb{R}^d \rightarrow \mathbb{R}^k$, where $1 \le k < d$. We propose a general and efficient learning approach that learns generative models on different level sets of $\xi$ simultaneously. To improve the learning quality on level sets in low-probability regions, we also propose a data enrichment strategy that uses data from enhanced sampling techniques. We demonstrate the effectiveness of the proposed learning approach through concrete numerical examples. The approach is potentially useful for the generative modeling of molecular systems in biophysics, for instance.
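To make the sparsity issue on low-probability level sets concrete, the following is a minimal sketch, not the authors' method. It assumes a toy setting that does not come from the paper: $\mu$ is a standard Gaussian in $\mathbb{R}^2$, and the collective variable is the hypothetical choice $\xi(x) = x_1$ (so $k = 1$). The function names `xi`, `level_set_counts`, and `samples_near_level_set` are illustrative only.

```python
import numpy as np


def xi(x):
    """Hypothetical collective variable (assumption): the first coordinate, so k = 1."""
    return x[:, 0]


def level_set_counts(samples, edges):
    """Count how many samples have xi(x) in each interval defined by `edges`."""
    counts, _ = np.histogram(xi(samples), bins=edges)
    return counts


def samples_near_level_set(samples, z, tol):
    """Return the samples whose xi-value lies within `tol` of the level z."""
    mask = np.abs(xi(samples) - z) < tol
    return samples[mask]


# Toy data (assumption): 10,000 draws from a standard Gaussian in R^2.
rng = np.random.default_rng(0)
samples = rng.standard_normal((10_000, 2))

# Eight unit-width bins of xi-values covering [-4, 4].
edges = np.linspace(-4.0, 4.0, 9)
counts = level_set_counts(samples, edges)

# Data available for conditioning near a rare level versus a typical one.
rare = samples_near_level_set(samples, z=3.5, tol=0.5)
typical = samples_near_level_set(samples, z=0.0, tol=0.5)

for lo, hi, c in zip(edges[:-1], edges[1:], counts):
    print(f"xi in [{lo:+.0f}, {hi:+.0f}): {c} samples")
print(f"near xi = 3.5: {len(rare)} samples; near xi = 0.0: {len(typical)} samples")
```

In this toy setting the bins in the tails of $\xi$ hold far fewer samples than the central bins, which is the situation the data enrichment strategy in the abstract is meant to address. How the enriched data is produced and used is specific to the paper and is not reproduced here.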