Amortized Sampling with Transferable Normalizing Flows

📅 2025-08-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the low sampling efficiency and poor transferability of molecular conformational equilibrium sampling across systems, this paper introduces Prose, a transferable normalizing flow. Built on an all-atom architecture with 280 million parameters, Prose is trained on large-scale peptide molecular dynamics trajectories and fine-tuned via importance sampling. It achieves, for the first time, zero-shot generation of uncorrelated conformations across varying sequence lengths without retraining or system-specific sampling. On unseen tetrapeptide systems, Prose significantly outperforms conventional methods such as sequential Monte Carlo. Its core innovation lies in unifying exact likelihood evaluation and amortized sampling within a scalable, transferable deep generative framework, enabling efficient equilibrium sampling for arbitrary peptide chains. The model weights, source code, and dataset are publicly released, establishing a new paradigm for amortized conformational sampling.

📝 Abstract
Efficient equilibrium sampling of molecular conformations remains a core challenge in computational chemistry and statistical inference. Classical approaches such as molecular dynamics or Markov chain Monte Carlo inherently lack amortization; the computational cost of sampling must be paid in full for each system of interest. The widespread success of generative models has inspired interest in overcoming this limitation through learned sampling algorithms. Despite performing on par with conventional methods when trained on a single system, learned samplers have so far demonstrated limited ability to transfer across systems. We show that deep learning enables the design of scalable and transferable samplers by introducing Prose, a 280 million parameter all-atom transferable normalizing flow trained on a corpus of peptide molecular dynamics trajectories up to 8 residues in length. Prose draws zero-shot uncorrelated proposal samples for arbitrary peptide systems, achieving previously intractable transferability across sequence length whilst retaining the efficient likelihood evaluation of normalizing flows. Through extensive empirical evaluation we demonstrate the efficacy of Prose as a proposal for a variety of sampling algorithms, finding a simple importance sampling-based finetuning procedure to achieve superior performance to established methods such as sequential Monte Carlo on unseen tetrapeptides. We open-source the Prose codebase, model weights, and training dataset to further stimulate research into amortized sampling methods and finetuning objectives.
Problem

Research questions and friction points this paper is trying to address.

Amortized sampling for molecular conformations across systems
Transferable normalizing flows for zero-shot peptide sampling
Overcoming computational cost limitations in equilibrium sampling
Innovation

Methods, ideas, or system contributions that make the work stand out.

Transferable normalizing flows for molecular sampling
Zero-shot uncorrelated proposal generation for peptides
Importance sampling-based finetuning for superior performance
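The importance-sampling idea above can be illustrated in miniature. A flow proposal q(x) provides both samples and exact log-densities, so samples can be reweighted toward a Boltzmann target exp(-U(x)). The sketch below is not the paper's method or architecture: a simple Gaussian stands in for the 280M-parameter Prose flow, a toy double-well potential stands in for the molecular energy, and all names are illustrative.

```python
import numpy as np

# Self-normalized importance sampling with a flow-like proposal (sketch).
# A Gaussian stands in for the normalizing-flow proposal q(x); a toy
# double-well energy U(x) stands in for the molecular target. Hypothetical
# names; not the Prose implementation.

rng = np.random.default_rng(0)
SIGMA = 1.5  # proposal scale, chosen to cover both wells

def sample_proposal(n):
    """Draw n samples from q and return them with their log-densities."""
    x = rng.normal(0.0, SIGMA, size=n)
    log_q = -0.5 * (x / SIGMA) ** 2 - np.log(SIGMA * np.sqrt(2 * np.pi))
    return x, log_q

def log_target(x):
    """Unnormalized log-density of a double-well target: -U(x)."""
    return -((x ** 2 - 1.0) ** 2)

x, log_q = sample_proposal(10_000)
log_w = log_target(x) - log_q        # importance log-weights
log_w -= log_w.max()                 # stabilize before exponentiating
w = np.exp(log_w)
w /= w.sum()                         # self-normalize

# Reweighted expectation under the target, and effective sample size (ESS);
# a low ESS/N ratio would signal a poor proposal.
mean_x2 = np.sum(w * x ** 2)
ess = 1.0 / np.sum(w ** 2)
```

In the paper's setting, the same weights serve double duty: they debias zero-shot flow samples toward the true equilibrium distribution and supply a signal for the importance sampling-based finetuning of the proposal itself.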