JAMUN: Transferable Molecular Conformational Ensemble Generation with Walk-Jump Sampling

Published: 2024-10-18
Venue: arXiv.org
Citations: 1
Influential citations: 0
AI Summary
To address the high computational cost of molecular dynamics (MD) simulations and the poor generalizability of existing machine learning (ML) methods for sampling protein conformational ensembles, this work introduces JAMUN, a generalizable point-cloud modeling framework intended as a step toward efficient sampling from the Boltzmann distribution of arbitrary proteins. The core idea extends the Walk-Jump Sampling paradigm to molecular point clouds, combining universal noise modeling, point-cloud diffusion generation, and geometrically invariant representation learning to support generalization across proteins. Experiments indicate that the method generates ensembles two to three orders of magnitude faster than both conventional MD and state-of-the-art ML approaches. It also predicts the stable conformational basins of small peptides that were not seen during training.

Abstract
Conformational ensembles of protein structures are immensely important both for understanding protein function and for drug discovery in novel modalities such as cryptic pockets. Current techniques for sampling ensembles are either computationally inefficient or do not transfer to systems outside their training data. We present walk-Jump Accelerated Molecular ensembles with Universal Noise (JAMUN), a step towards the goal of efficiently sampling the Boltzmann distribution of arbitrary proteins. By extending Walk-Jump Sampling to point clouds, JAMUN generates ensembles orders of magnitude faster than traditional molecular dynamics or state-of-the-art ML methods. Further, JAMUN is able to predict the stable basins of small peptides that were not seen during training.
Problem

Research questions and friction points this paper is trying to address.

Inefficient sampling of protein conformational ensembles
Limited transferability of machine learning methods
Slow generation of molecular dynamics ensembles
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines smoothed MD with score-based learning
Uses walk-jump sampling for faster ensembles
Transfers to peptide systems not seen during training
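
The walk-jump idea named in the list above can be illustrated with a minimal toy sketch. It is not the JAMUN implementation: it assumes a 1D two-basin Gaussian mixture in place of a molecule, uses the analytic score of the noise-smoothed density in place of a learned denoiser, and uses plain overdamped Langevin dynamics for the walk. The function names (smoothed_score, walk, jump) and all parameter values are hypothetical choices made for illustration. The "walk" samples the smoothed density at noise level sigma, and the "jump" maps each noisy sample back toward the clean data with the Tweedie estimate x_hat = y + sigma^2 * score(y).

```python
import numpy as np


def smoothed_score(y, sigma, means=(-2.0, 2.0), s=0.5):
    """Score d/dy log p_sigma(y) of an equal-weight 1D Gaussian mixture
    (component std s) after convolution with N(0, sigma^2).
    Stands in for a learned score/denoiser network (toy assumption)."""
    y = np.asarray(y, dtype=float)
    means = np.asarray(means, dtype=float)
    var = s**2 + sigma**2  # smoothing adds variances component-wise
    diff = means - y[..., None]
    # Posterior responsibility of each mixture component given y.
    logw = -0.5 * diff**2 / var
    w = np.exp(logw - logw.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return (w * diff).sum(axis=-1) / var


def walk(y0, sigma, n_steps, step, rng):
    """Walk: unadjusted overdamped Langevin dynamics on the smoothed density."""
    y = np.array(y0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(y.shape)
        y = y + step * smoothed_score(y, sigma) + np.sqrt(2.0 * step) * noise
    return y


def jump(y, sigma):
    """Jump: Tweedie estimate of the clean sample, E[x | y]."""
    y = np.asarray(y, dtype=float)
    return y + sigma**2 * smoothed_score(y, sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = 1.0
    # Many independent chains, all started near the barrier at 0.
    y = walk(rng.standard_normal(2000), sigma, n_steps=500, step=0.05, rng=rng)
    x_hat = jump(y, sigma)
    print("fraction of jumped samples in the right basin:", (x_hat > 0).mean())
```

Because the smoothed density has lower barriers between basins than the original one, the walk can mix between modes more easily, while the jump step recovers samples that sit close to the original basins.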
Ameya Daigavane
Massachusetts Institute of Technology
Machine Learning, Visualization, Scientific Computing
Bodhi P. Vani
Genentech Computational Sciences, South San Francisco, CA, USA
Saeed Saremi
Genentech Computational Sciences, South San Francisco, CA, USA
Joseph Kleinhenz
Genentech Computational Sciences, South San Francisco, CA, USA
Joshua Rackers
Genentech Computational Sciences, South San Francisco, CA, USA