Generative Pseudo-Force Fields for Molecular Generation

📅 2026-05-18

🤖 AI Summary
This work addresses the trade-off between physical realism and sampling efficiency in molecular conformation generation. Machine learning force fields (MLFFs) require costly ab initio training data, while diffusion models depend on noise schedules and explicit timestep conditioning. The authors propose generative pseudo-force fields (GPFFs), which train an MLFF on a quadratic pseudo-potential energy surface defined around reference equilibrium geometries. Non-equilibrium training data can therefore be generated online by perturbing the equilibria with Gaussian noise, with no ab initio evaluations for the perturbed conformations. Because the magnitudes of the predicted pseudo-forces implicitly encode the noise level, a GPFF acts as a diffusion model without explicit timestep conditioning. The framework works with standard diffusion samplers, adaptive variants, and a direct denoising scheme, and it supports structural priors and geometric constraints. On QM9, it achieves over 50% validity with only 6 neural function evaluations (NFE) and 100% validity at 256 NFE. Combined with custom priors, the method is also showcased in a molecular editor for drug design, where molecules are generated in real time.
📝 Abstract
Generating stable molecular conformations typically forces a trade-off between the physical realism of energy-based relaxation and the sampling efficiency of data-driven generative models. While machine learning force fields (MLFFs) can sample stable conformations by relaxing molecular geometries according to physical forces, they require costly ab initio training data. Conversely, diffusion models (DMs) learn from equilibrium data alone but are dependent on noise schedules and time-step conditioning. In this work, we propose generative pseudo-force fields (GPFFs) to bridge these paradigms by training an MLFF on a quadratic pseudo-potential energy surface relative to reference equilibrium structures. Because no ab initio calculations are required for the perturbed geometries, non-equilibrium training data can be generated on the fly by perturbing the equilibria with Gaussian noise. We show that GPFFs constitute a time-step-agnostic variant of variance-exploding DMs: the score comes from the predicted pseudo-forces, but because force magnitudes implicitly encode the noise level, no time-step conditioning is needed. Our GPFF can hence be used as a drop-in replacement in standard diffusion sampling (ancestral, Heun), but it also facilitates more efficient, adaptive variants and an MLFF-inspired direct denoising scheme. Our proposed sampling algorithms support arbitrary structural priors and geometric constraints. On QM9, GPFF has 100% validity at 256 neural function evaluations (NFE) and over 50% at just 6 NFE, outperforming diffusion baselines across all samplers. Combined with custom priors, we showcase the fast and accurate generation process of our method in a molecular editor for a drug design setting, where a molecule is generated in real time.
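The on-the-fly data generation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the unit spring constant `k`, and the isotropic form of the quadratic pseudo-potential are assumptions made for illustration, and details such as center-of-mass removal or rotational alignment are ignored. Under these assumptions the pseudo-force on a perturbed geometry is `-k * (x - x_eq)`, which equals `k * sigma**2` times the score of the Gaussian perturbation kernel, and its RMS magnitude is `k * sigma`, so the noise level can be read off the forces without a time-step input.

```python
import numpy as np


def make_training_pair(x_eq, sigma, rng, k=1.0):
    """Perturb an equilibrium geometry and label it with pseudo-energy and pseudo-forces.

    x_eq  : (n_atoms, 3) reference equilibrium positions
    sigma : standard deviation of the Gaussian perturbation
    k     : spring constant of the quadratic pseudo-potential (assumed, not from the paper)
    """
    x = x_eq + sigma * rng.standard_normal(x_eq.shape)
    disp = x - x_eq
    energy = 0.5 * k * np.sum(disp ** 2)  # quadratic pseudo-potential around x_eq
    forces = -k * disp                    # negative gradient of the pseudo-energy
    return x, energy, forces


def implied_sigma(forces, k=1.0):
    """Estimate the noise level from the RMS force component (E[F_i^2] = k^2 sigma^2)."""
    return np.sqrt(np.mean(forces ** 2)) / k


def relax(x, force_fn, n_steps=6, step=0.5):
    """Toy direct denoising: repeatedly move the geometry along the (pseudo-)forces."""
    for _ in range(n_steps):
        x = x + step * force_fn(x)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x_eq = rng.standard_normal((5, 3))          # stand-in for a real equilibrium geometry
    x, energy, forces = make_training_pair(x_eq, sigma=0.3, rng=rng)

    # Oracle force field; in practice a trained MLFF that never sees x_eq would go here.
    oracle = lambda pos: -(pos - x_eq)
    x_relaxed = relax(x, oracle)
    print("displacement before:", np.linalg.norm(x - x_eq))
    print("displacement after: ", np.linalg.norm(x_relaxed - x_eq))
```

In training, a network would be fit to predict `forces` (and optionally `energy`) from `x` alone, with `sigma` drawn from some range of noise levels; the oracle above only stands in for such a model to show that following the pseudo-forces returns the geometry to the reference structure.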
Problem

Research questions and friction points this paper is trying to address.

molecular generation
force fields
diffusion models
conformational sampling
generative modeling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative Pseudo-Force Fields
Diffusion Models
Machine Learning Force Fields
Molecular Conformation Generation
Score-Based Generative Modeling
Stefaan Simon Pierre Hessmann
Machine Learning Group, Technische Universität Berlin, Berlin, Germany; BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
Khaled Kahouli
Machine Learning Group, TU Berlin
Machine Learning, Generative Models, Computational Chemistry
Stefan Gugler
Postdoc at TU Berlin
Machine Learning for Quantum Chemistry, Theoretical Chemistry
Michael Plainer
Free University of Berlin, Technical University of Berlin, ELIZA, BIFOLD
Machine Learning, Generative Models, AI4Science
Frank Noé
Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany; Department of Physics, Freie Universität Berlin, Berlin, Germany; Microsoft Research AI4Science, Berlin, Germany; Department of Chemistry, Rice University, Houston, USA
Klaus-Robert Müller
TU Berlin & Korea University & Google DeepMind & Max Planck Institute for Informatics, Germany
Machine learning, artificial intelligence, big data, computational neuroscience
Niklas Wolf Andreas Gebauer
Machine Learning Group, Technische Universität Berlin, Berlin, Germany; BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany