🤖 AI Summary
Molecular crystal structure prediction (CSP) faces two challenges: a very large search space of candidate packings and energy differences of only a few kJ/mol between polymorphs. Dispersion-inclusive DFT is accurate enough to resolve these differences but too expensive to apply to a large number of putative structures. This work introduces FastCSP, an open-source, end-to-end CSP workflow built on the universal machine-learned interatomic potential UMA, which requires no system-specific tuning and replaces both the classical force fields used in early CSP stages and the final DFT re-ranking. The pipeline combines random structure generation with Genarris 3.0, UMA-driven geometry relaxation, and free-energy calculations. On a curated benchmark of 28 mostly rigid molecules, it consistently generates the known experimental structures and ranks them within 5 kJ/mol per molecule of the global minimum. Results for a single system can be obtained within hours on tens of modern GPUs, making high-throughput CSP practical for applications such as pharmaceuticals and organic electronics.
📝 Abstract
Crystal structure prediction (CSP) of molecular crystals plays a central role in applications such as pharmaceuticals and organic electronics. CSP is challenging and computationally expensive because it must explore a large search space with sufficient accuracy to capture energy differences of a few kJ/mol between polymorphs. Dispersion-inclusive density functional theory (DFT) provides the required accuracy, but its computational cost makes it impractical for a large number of putative structures. We introduce FastCSP, an open-source, high-throughput CSP workflow based on machine learning interatomic potentials (MLIPs). FastCSP combines random structure generation using Genarris 3.0 with geometry relaxation and free energy calculations powered entirely by the Universal Model for Atoms (UMA) MLIP. We benchmark FastCSP on a curated set of 28 mostly rigid molecules, demonstrating that our workflow consistently generates known experimental structures and ranks them within 5 kJ/mol per molecule of the global minimum. Our results demonstrate that universal MLIPs can be used across diverse compounds without requiring system-specific tuning. Moreover, the speed and accuracy afforded by UMA eliminate the need for classical force fields in the early stages of CSP and for final re-ranking with DFT. The open-source release of the entire FastCSP workflow significantly lowers the barrier to accessing CSP. CSP results for a single system can be obtained within hours on tens of modern GPUs, making high-throughput crystal structure prediction feasible for a broad range of scientific applications.
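To make the ranking criterion concrete, the following is a minimal sketch of how relaxed candidate structures might be ranked by energy per molecule and filtered to a 5 kJ/mol window above the global minimum. It is not FastCSP's code or API: the `Candidate` class, the function names, the use of eV as the input unit, and the example energies are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Standard conversion: 1 eV per particle is about 96.485 kJ/mol.
EV_TO_KJ_PER_MOL = 96.485


@dataclass
class Candidate:
    """Hypothetical record for one relaxed crystal structure."""
    name: str
    total_energy_ev: float  # energy of the whole unit cell, in eV (assumed unit)
    n_molecules: int        # number of molecules in the unit cell (Z)


def relative_energies(candidates):
    """Return (name, energy above the minimum in kJ/mol per molecule), lowest first."""
    per_molecule = {
        c.name: c.total_energy_ev / c.n_molecules * EV_TO_KJ_PER_MOL
        for c in candidates
    }
    e_min = min(per_molecule.values())
    return sorted(
        ((name, energy - e_min) for name, energy in per_molecule.items()),
        key=lambda pair: pair[1],
    )


def within_window(candidates, window_kj_mol=5.0):
    """Names of candidates no more than `window_kj_mol` above the global minimum."""
    return [
        name
        for name, delta in relative_energies(candidates)
        if delta <= window_kj_mol
    ]


# Made-up energies, chosen only to exercise the functions.
candidates = [
    Candidate("A", -400.00, 4),  # -100.00 eV per molecule
    Candidate("B", -199.98, 2),  #  -99.99 eV per molecule
    Candidate("C", -399.60, 4),  #  -99.90 eV per molecule
]

ranked = relative_energies(candidates)
shortlist = within_window(candidates)
```

Normalizing by the number of molecules in the cell is what allows structures with different cell contents to be compared on one scale, which is why the abstract quotes the window in kJ/mol per molecule.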