AI Summary
Accurately predicting the three-dimensional crystal structures of organic molecules is a critical prerequisite for the rational design of functional materials. This paper addresses the crystal assembly problem for rigid molecular ensembles by proposing SinkFast, a regression-based method that, for the first time, unifies geometric awareness and permutation invariance within a differentiable loss function. Leveraging the Sinkhorn algorithm, SinkFast enables end-to-end optimization of molecular permutations without resorting to complex iterative flow matching. Its core innovation lies in a geometrically constrained, permutation-invariant, differentiable linear assignment loss, which significantly improves both modeling efficiency and prediction accuracy. On the COD-Cluster17 benchmark, SinkFast substantially outperforms existing flow-matching models using a markedly simpler architecture, demonstrating superior computational efficiency and generalization capability.
Abstract
Crystalline structure prediction remains an open challenge in materials design. Despite recent advances in computational materials science, accurately predicting the three-dimensional crystal structures of organic materials, an essential first step for designing materials with targeted properties, remains elusive. In this work, we address the problem of molecular assembly, where a set $\mathcal{S}$ of identical rigid molecules is packed to form a crystalline structure. Existing state-of-the-art models typically rely on computationally expensive, iterative flow-matching approaches. We propose a novel loss function that correctly captures key geometric molecular properties while maintaining permutation invariance over $\mathcal{S}$. We achieve this via a differentiable linear assignment scheme based on the Sinkhorn algorithm. Remarkably, we show that even a simple regression model trained with our method, SinkFast, significantly outperforms more complex flow-matching approaches on the COD-Cluster17 benchmark, a curated subset of the Crystallography Open Database (COD).
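To make the idea of a Sinkhorn-based, permutation-invariant assignment loss concrete, the following is a minimal sketch and not the SinkFast loss itself. It assumes each molecule is summarized by a single 3D centroid and uses squared centroid distance as the pairwise cost; the actual geometric terms used by the method are not given in this abstract. The function names (`sinkhorn_plan`, `sinkhorn_assignment_loss`) and the parameters `eps` and `n_iters` are hypothetical. The code is plain Python for readability; written with the same operations in an automatic-differentiation framework, the loss would be differentiable with respect to the predictions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn_plan(cost, eps=0.05, n_iters=200):
    """Approximate a soft assignment (doubly stochastic matrix) for a square
    cost matrix by alternating row and column normalization in log space."""
    n = len(cost)
    log_p = [[-cost[i][j] / eps for j in range(n)] for i in range(n)]
    for _ in range(n_iters):
        # Normalize every row so that it sums to 1.
        for i in range(n):
            z = logsumexp(log_p[i])
            log_p[i] = [v - z for v in log_p[i]]
        # Normalize every column so that it sums to 1.
        for j in range(n):
            z = logsumexp([log_p[i][j] for i in range(n)])
            for i in range(n):
                log_p[i][j] -= z
    return [[math.exp(v) for v in row] for row in log_p]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sinkhorn_assignment_loss(pred, target, eps=0.05, n_iters=200):
    """Soft-assignment loss: expected pairwise cost under the Sinkhorn plan.

    Assumption: the cost is the squared distance between predicted and target
    centroids, a stand-in for whatever geometric cost the method defines.
    """
    cost = [[sq_dist(p, t) for t in target] for p in pred]
    plan = sinkhorn_plan(cost, eps, n_iters)
    n = len(pred)
    return sum(plan[i][j] * cost[i][j] for i in range(n) for j in range(n)) / n


# Usage: the loss should not depend on the order in which targets are listed.
target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
shuffled = [target[2], target[0], target[1]]
pred = [(0.1, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 1.1, 0.0)]
loss_a = sinkhorn_assignment_loss(pred, target)
loss_b = sinkhorn_assignment_loss(pred, shuffled)
```

Because the soft assignment is recomputed from the cost matrix, relabeling the molecules in the target only permutes the columns of the plan and leaves the loss value unchanged up to floating-point rounding. The temperature `eps` controls how close the plan is to a hard permutation: smaller values give a sharper assignment but typically need more iterations to converge.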