🤖 AI Summary
To address the challenges of modeling spatial symmetries, navigating high-dimensional configuration spaces, and achieving data- and compute-efficient crystal generation, this work introduces CrystalFormer, an autoregressive Transformer for space-group-controlled generation of crystalline materials. It exploits the discrete, sequential nature of Wyckoff positions, casting crystal generation as a symmetry-constrained sequence-generation task in which the model predicts the species and locations of the symmetry-inequivalent atoms in the unit cell. Because space-group information is built into the generative process, the model supports symmetric structure initialization, element substitution, and property-guided design. The authors report advantages over conventional methods implemented in popular crystal structure prediction software on symmetric structure initialization and element substitution. Their analysis also indicates that the model picks up sensible solid-state chemistry knowledge and heuristics from the training data, and that property-guided design can be applied in a plug-and-play manner.
📝 Abstract
We introduce CrystalFormer, a transformer-based autoregressive model specifically designed for space group-controlled generation of crystalline materials. The incorporation of space group symmetry significantly simplifies the crystal space, which is crucial for data- and compute-efficient generative modeling of crystalline materials. Leveraging the prominent discrete and sequential nature of the Wyckoff positions, CrystalFormer learns to generate crystals by directly predicting the species and locations of symmetry-inequivalent atoms in the unit cell. We demonstrate the advantages of CrystalFormer in standard tasks such as symmetric structure initialization and element substitution compared to conventional methods implemented in popular crystal structure prediction software. Moreover, we showcase the application of CrystalFormer to property-guided materials design in a plug-and-play manner. Our analysis shows that CrystalFormer ingests sensible solid-state chemistry knowledge and heuristics by compressing the material dataset, thus enabling systematic exploration of crystalline materials. The simplicity, generality, and flexibility of CrystalFormer position it as a promising architecture to be the foundational model of the entire crystalline materials space, heralding a new era in materials modeling and discovery.
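As a rough illustration of why describing a crystal by its symmetry-inequivalent atoms shortens what a model has to generate, the sketch below expands a three-token sequence into the five atoms of a cubic perovskite cell. It is not CrystalFormer code: the table `WYCKOFF_PM3M`, the function `expand`, and the `(letter, element)` token layout are hypothetical names chosen for this example, and only three Wyckoff positions of space group Pm-3m (No. 221), none of which has a free coordinate, are written out by hand.

```python
from fractions import Fraction

H = Fraction(1, 2)

# Hand-written subset of the Wyckoff positions of space group Pm-3m (No. 221).
# Each letter maps to the fractional coordinates of every symmetry-equivalent site.
# (Illustrative table only; a real model would cover all positions of all 230 groups.)
WYCKOFF_PM3M = {
    "a": [(0, 0, 0)],
    "b": [(H, H, H)],
    "c": [(0, H, H), (H, 0, H), (H, H, 0)],
}


def expand(sequence):
    """Expand (Wyckoff letter, element) tokens into every atom in the unit cell."""
    atoms = []
    for letter, element in sequence:
        for site in WYCKOFF_PM3M[letter]:
            atoms.append((element, site))
    return atoms


# Cubic perovskite SrTiO3: three symmetry-inequivalent atoms describe the whole cell.
perovskite = [("a", "Sr"), ("b", "Ti"), ("c", "O")]
atoms = expand(perovskite)

for element, site in atoms:
    print(element, tuple(float(x) for x in site))
```

Here the sequence has three entries while the unit cell holds five atoms, and the space group fixes every coordinate. Wyckoff positions with free coordinates would additionally need those values in the sequence, which this sketch leaves out.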