Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance

📅 2025-03-18
🤖 AI Summary
Sampling molecular conformational distributions is essential for predicting ensemble-averaged physical properties, yet conventional molecular dynamics (MD) and Markov chain Monte Carlo (MCMC) methods are computationally expensive, and obtaining unbiased samples with them requires satisfying ergodicity. To address this, the authors propose Potential Score Matching (PSM), a framework that uses potential energy gradients to guide the score estimation of diffusion models and does not require exact energy functions. PSM can debias the learned distribution toward the Boltzmann distribution even when trained on limited and biased data. On the Lennard-Jones (LJ) potential, PSM outperforms existing state-of-the-art models; on the high-dimensional MD17 and MD22 benchmarks, it generates conformational distributions closer to the Boltzmann distribution than traditional diffusion models do.

📝 Abstract
The ensemble average of physical properties of molecules is closely related to the distribution of molecular conformations, and sampling such distributions is a fundamental challenge in physics and chemistry. Traditional methods like molecular dynamics (MD) simulations and Markov chain Monte Carlo (MCMC) sampling are commonly used but can be time-consuming and costly. Recently, diffusion models have emerged as efficient alternatives by learning the distribution of training data. Obtaining an unbiased target distribution is still an expensive task, primarily because it requires satisfying ergodicity. To tackle these challenges, we propose Potential Score Matching (PSM), an approach that utilizes the potential energy gradient to guide generative models. PSM does not require exact energy functions and can debias sample distributions even when trained on limited and biased data. Our method outperforms existing state-of-the-art (SOTA) models on the Lennard-Jones (LJ) potential, a commonly used toy model. Furthermore, we extend the evaluation of PSM to high-dimensional problems using the MD17 and MD22 datasets. The results demonstrate that molecular distributions generated by PSM more closely approximate the Boltzmann distribution compared to traditional diffusion models.
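
The abstract does not spell out the PSM training objective, so the following is not the authors' method. It is only a minimal sketch of the standard relation that potential-energy guidance builds on: for a Boltzmann distribution p(x) ∝ exp(-E(x)/kT), the score ∇ log p(x) equals -∇E(x)/kT. The example uses the Lennard-Jones pair potential in reduced units; the function names and the default values eps=1, sigma=1, kT=1 are assumptions made for illustration.

```python
import math

def lj_pair_energy(r, eps=1.0, sigma=1.0):
    """Lennard-Jones energy of one pair at distance r."""
    s6 = (sigma / r) ** 6
    return 4.0 * eps * (s6 * s6 - s6)

def lj_energy(coords, eps=1.0, sigma=1.0):
    """Total LJ energy of a list of 3D points (sum over unique pairs)."""
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            total += lj_pair_energy(math.dist(coords[i], coords[j]), eps, sigma)
    return total

def lj_gradient(coords, eps=1.0, sigma=1.0):
    """Analytic gradient dE/dx, returned as one [gx, gy, gz] row per particle."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = [a - b for a, b in zip(coords[i], coords[j])]
            r2 = sum(c * c for c in d)
            s6 = (sigma * sigma / r2) ** 3
            # (dE/dr) / r for this pair, so that dE/dx_i = coeff * (x_i - x_j)
            coeff = 4.0 * eps * (-12.0 * s6 * s6 + 6.0 * s6) / r2
            for k in range(3):
                grad[i][k] += coeff * d[k]
                grad[j][k] -= coeff * d[k]
    return grad

def boltzmann_score(coords, kT=1.0, eps=1.0, sigma=1.0):
    """Score of the Boltzmann distribution: grad log p(x) = -grad E(x) / kT."""
    return [[-g / kT for g in row] for row in lj_gradient(coords, eps, sigma)]
```

For a single pair at the LJ minimum, r = 2^(1/6)·sigma, the energy is -eps and the score vanishes, which gives a quick sanity check on the gradient.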
Problem

Research questions and friction points this paper is trying to address.

Debiasing molecular structure sampling using potential energy guidance.
Overcoming time and cost limitations of traditional molecular sampling methods.
Approximating Boltzmann distribution more accurately with limited, biased data.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Utilizes the potential energy gradient to guide generative models.
Debiases sample distributions even when trained on limited, biased data.
Outperforms SOTA models on the Lennard-Jones (LJ) potential.
Authors

Liya Guo
Tsinghua University
Zun Wang
Microsoft Research AI4Science; Beijing, China
Chang Liu
Microsoft Research AI4Science; Beijing, China
Junzhe Li
School of Computer Science, Peking University; Beijing, China
Pipi Hu
Senior researcher, Microsoft Research AI4Science
Differential equation related neural networks
Yi Zhu
Yau Mathematical Sciences Center, Tsinghua University; Yanqi Lake Beijing Institute of Mathematical Sciences and Applications; Beijing, China