AI Summary
To address the dual bottlenecks of high computational cost in molecular dynamics (MD) simulations and poor generalizability of existing machine learning (ML) methods for protein conformational ensemble sampling, this work introduces the first generalizable point-cloud modeling framework for efficient sampling from the Boltzmann distribution of arbitrary proteins. Its core innovation extends the Walk-Jump sampling paradigm to molecular point clouds, integrating universal noise modeling, point-cloud diffusion generation, and geometrically invariant representation learning to support robust cross-protein generalization. Experiments demonstrate that the method accelerates sampling by two to three orders of magnitude over both conventional MD and state-of-the-art ML approaches. Moreover, it successfully predicts stable conformational basins for small peptides not present in training, improving both sampling efficiency and generalizability in conformational generation.
Abstract
Conformational ensembles of protein structures are immensely important both for understanding protein function and for drug discovery in novel modalities such as cryptic pockets. Current techniques for sampling ensembles are either computationally inefficient or do not transfer to systems outside their training data. We present walk-Jump Accelerated Molecular ensembles with Universal Noise (JAMUN), a step towards the goal of efficiently sampling the Boltzmann distribution of arbitrary proteins. By extending Walk-Jump Sampling to point clouds, JAMUN generates ensembles orders of magnitude faster than traditional molecular dynamics or state-of-the-art ML methods. Further, JAMUN is able to predict the stable basins of small peptides that were not seen during training.
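As a rough illustration of the Walk-Jump idea on point clouds, the sketch below runs Langevin dynamics on a noise-smoothed density (the "walk") and then denoises the result in a single step using the score of that density (the "jump"). It is a toy under stated assumptions, not JAMUN's implementation: the learned score model is replaced by the analytic score of an equal-weight Gaussian mixture centered on a few reference point clouds, the walk uses plain overdamped Langevin steps, and rotational or translational invariance is ignored. All names (`smoothed_score`, `walk`, `jump`, `centers`) are hypothetical.

```python
import numpy as np


def smoothed_score(y, centers, sigma):
    """Score (gradient of log-density) of the sigma-smoothed toy density.

    Assumption: the clean data are K reference point clouds `centers` of
    shape (K, N, 3) with equal weight, so the smoothed density is a
    mixture of isotropic Gaussians with standard deviation `sigma`.
    """
    diffs = centers - y                              # (K, N, 3)
    sq_dist = np.sum(diffs ** 2, axis=(1, 2))        # (K,)
    logits = -sq_dist / (2.0 * sigma ** 2)
    weights = np.exp(logits - logits.max())          # stable softmax
    weights /= weights.sum()
    # Weighted average of (center - y), divided by sigma^2.
    return np.tensordot(weights, diffs, axes=1) / sigma ** 2   # (N, 3)


def walk(y, centers, sigma, step, n_steps, rng):
    """Walk: unadjusted overdamped Langevin steps in the noisy space."""
    for _ in range(n_steps):
        noise = rng.standard_normal(y.shape)
        y = y + step * smoothed_score(y, centers, sigma) + np.sqrt(2.0 * step) * noise
    return y


def jump(y, centers, sigma):
    """Jump: one-step denoising, x_hat = y + sigma^2 * score(y)."""
    return y + sigma ** 2 * smoothed_score(y, centers, sigma)


# Toy usage: two well-separated reference "conformations" of 4 points each.
rng = np.random.default_rng(0)
centers = np.stack([np.zeros((4, 3)), np.full((4, 3), 5.0)])
sigma = 0.5

y0 = centers[0] + sigma * rng.standard_normal((4, 3))   # noisy starting point
y = walk(y0, centers, sigma, step=0.01, n_steps=100, rng=rng)
x_hat = jump(y, centers, sigma)                          # denoised sample
```

The walk never leaves the smoothed space, where basins are easier to traverse, and clean samples are read out only through the jump. In this toy the jump returns a weighted average of the reference clouds, which collapses to the nearest one when the references are far apart relative to `sigma`.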