🤖 AI Summary
This work proposes MolCrystalFlow, a flow-matching-based generative model for molecular crystal structure prediction that addresses the challenges posed by large molecular size and complex intra- and intermolecular interactions. By treating molecules as rigid bodies, the method decouples intramolecular conformations from intermolecular packing and jointly learns the lattice matrix, molecular orientations, and centroid positions. MolCrystalFlow integrates geodesic flows on Riemannian manifolds with graph neural networks, reportedly the first such approach in molecular crystal generation, so that geometric symmetries are preserved. It can also be coupled with a universal machine learning potential to accelerate structure prediction. Evaluated on two open-source molecular crystal datasets, the model is reported to outperform state-of-the-art generative models for large periodic crystals as well as rule-based structure generation methods, improving the efficiency of predicting large periodic crystal structures.
📝 Abstract
Molecular crystal structure prediction represents a grand challenge in computational chemistry due to the large sizes of the constituent molecules and complex intra- and intermolecular interactions. While generative modeling has revolutionized structure discovery for molecules, inorganic solids, and metal-organic frameworks, extending such approaches to fully periodic molecular crystals remains elusive. Here, we present MolCrystalFlow, a flow-based generative model for molecular crystal structure prediction. The framework disentangles intramolecular complexity from intermolecular packing by embedding molecules as rigid bodies and jointly learning the lattice matrix, molecular orientations, and centroid positions. Centroids and orientations are represented on their native Riemannian manifolds, allowing geodesic flow construction and graph neural network operations that respect geometric symmetries. We benchmark our model against state-of-the-art generative models for large periodic crystals and rule-based structure generation methods on two open-source molecular crystal datasets. We demonstrate an integration of the MolCrystalFlow model with a universal machine learning potential to accelerate molecular crystal structure prediction, paving the way for data-driven generative discovery of molecular crystals.
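To make the idea of geodesic flow construction concrete, the following is a minimal sketch and not the authors' implementation. It assumes, as an illustration only, that centroids are stored as fractional coordinates on the flat torus [0, 1)^3 and that rigid-body orientations are rotation matrices in SO(3); the abstract does not name the manifolds explicitly. The function names `torus_geodesic`, `so3_log`, `so3_exp`, and `so3_geodesic` are hypothetical.

```python
import numpy as np


def torus_geodesic(x0, x1, t):
    """Point at time t on the shortest path from x0 to x1 on the flat torus [0, 1)^3.

    Assumption: centroids are fractional coordinates, so each axis wraps at 1.
    """
    delta = x1 - x0
    delta = delta - np.round(delta)  # wrap the displacement into [-0.5, 0.5]
    return (x0 + t * delta) % 1.0


def so3_log(R):
    """Rotation vector (axis times angle) of a rotation matrix; assumes angle < pi."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-8:
        return np.zeros(3)
    # R - R^T = 2 sin(theta) K, where K is the skew matrix of the unit axis
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta * w / (2.0 * np.sin(theta))


def so3_exp(v):
    """Rotation matrix of a rotation vector (Rodrigues' formula)."""
    theta = np.linalg.norm(v)
    if theta < 1e-8:
        return np.eye(3)
    k = v / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def so3_geodesic(R0, R1, t):
    """Point at time t on the geodesic from R0 to R1 in SO(3)."""
    return R0 @ so3_exp(t * so3_log(R0.T @ R1))


if __name__ == "__main__":
    # Centroid path that crosses the periodic boundary instead of the long way round
    x0 = np.array([0.9, 0.1, 0.5])
    x1 = np.array([0.2, 0.8, 0.5])
    print(torus_geodesic(x0, x1, 0.5))

    # Orientation path from the identity to a 90 degree rotation about z
    R1 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    print(so3_geodesic(np.eye(3), R1, 0.5))
```

In a flow-matching setup, interpolants of this kind would typically connect a noise sample at t = 0 to a data sample at t = 1, with a network trained to predict the velocity along the path; how MolCrystalFlow parameterizes that velocity is not stated in the abstract.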