JAMUN: Transferable Molecular Conformational Ensemble Generation with Walk-Jump Sampling

📅 2024-10-18
🏛️ arXiv.org
📈 Citations: 1
Influential: 0

🤖 AI Summary
Molecular dynamics (MD) simulations are computationally expensive, and existing machine learning (ML) methods for sampling protein conformational ensembles generalize poorly beyond their training data. This work introduces JAMUN, a generalizable point-cloud modeling framework presented as a step toward efficient sampling from the Boltzmann distribution of arbitrary proteins. Its core innovation extends the Walk-Jump Sampling paradigm to molecular point clouds, integrating universal noise modeling, point-cloud diffusion generation, and geometrically invariant representation learning to support generalization across proteins. The authors report ensemble generation that is orders of magnitude faster than both conventional MD and state-of-the-art ML approaches. The method also predicts the stable conformational basins of small peptides that were not seen during training.

📝 Abstract
Conformational ensembles of protein structures are immensely important both to understanding protein function, and for drug discovery in novel modalities such as cryptic pockets. Current techniques for sampling ensembles are computationally inefficient, or do not transfer to systems outside their training data. We present walk-Jump Accelerated Molecular ensembles with Universal Noise (JAMUN), a step towards the goal of efficiently sampling the Boltzmann distribution of arbitrary proteins. By extending Walk-Jump Sampling to point clouds, JAMUN enables ensemble generation at orders of magnitude faster rates than traditional molecular dynamics or state-of-the-art ML methods. Further, JAMUN is able to predict the stable basins of small peptides that were not seen during training.

Problem

Research questions and friction points this paper is trying to address.

Inefficient sampling of protein conformational ensembles
Limited transferability of machine learning methods
Slow generation of molecular dynamics ensembles

Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines smoothed MD with score-based learning
Uses walk-jump sampling for faster ensembles
Transfers to small peptides not seen during training
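
The walk-jump idea named above can be illustrated with a toy sketch. It is not the JAMUN implementation. It assumes a 1D two-mode Gaussian mixture in place of a molecule, an analytic score in place of a learned denoiser, and plain overdamped Langevin steps for the "walk". All function names and parameter values (`smoothed_score`, `walk`, `jump`, `walk_jump_demo`, `sigma=0.8`, and so on) are hypothetical. The "walk" samples the noise-smoothed variable y = x + sigma * eps, and the "jump" maps each y back toward clean data with the standard denoising estimate x_hat = y + sigma^2 * score(y).

```python
import numpy as np

def smoothed_score(y, means, s, sigma):
    """Score of an equal-weight 1D Gaussian mixture (component std s)
    after smoothing with Gaussian noise of std sigma."""
    var = s**2 + sigma**2                      # variance of each smoothed component
    diffs = means[None, :] - y[:, None]        # shape (n_chains, n_modes)
    logw = -0.5 * diffs**2 / var               # unnormalized log responsibilities
    logw -= logw.max(axis=1, keepdims=True)    # stabilize the softmax
    w = np.exp(logw)
    w /= w.sum(axis=1, keepdims=True)
    return (w * diffs).sum(axis=1) / var

def walk(y, score_fn, step, n_steps, rng):
    """Walk: unadjusted overdamped Langevin steps on the smoothed density."""
    for _ in range(n_steps):
        noise = rng.standard_normal(y.shape)
        y = y + step * score_fn(y) + np.sqrt(2.0 * step) * noise
    return y

def jump(y, score_fn, sigma):
    """Jump: one-shot denoising, x_hat = y + sigma^2 * score(y)."""
    return y + sigma**2 * score_fn(y)

def walk_jump_demo(n_chains=2000, sigma=0.8, seed=0):
    rng = np.random.default_rng(seed)
    means = np.array([-2.0, 2.0])              # toy "stable basins" (assumed)
    s = 0.3                                    # toy within-basin spread (assumed)
    score_fn = lambda y: smoothed_score(y, means, s, sigma)
    y = 3.0 * rng.standard_normal(n_chains)    # broad initialization
    y = walk(y, score_fn, step=0.05, n_steps=500, rng=rng)
    x_hat = jump(y, score_fn, sigma)
    return y, x_hat

if __name__ == "__main__":
    y, x_hat = walk_jump_demo()
    print("std of walked samples:", y.std())
    print("std of jumped samples:", x_hat.std())
```

In this toy setting the walked samples stay spread out around each mode, because they follow the smoothed density, while the jumped samples contract toward the modes. A larger `sigma` makes the walk mix between basins more easily but leaves more work for the denoising step.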