🤖 AI Summary
Existing crystal generation methods offer limited controllability and struggle to explore novel structures beyond the training distribution. This work proposes a compositional generative framework grounded in concept learning, introducing for the first time interpretable and reusable crystal building blocks as fundamental generative units. The approach employs a vector-quantized variational autoencoder to automatically extract crystal concepts that integrate local atomic environments with global symmetry, and combines these with a concept composition generator and an iterative self-refinement strategy to enable targeted exploration. By moving beyond conventional black-box random sampling, the method improves the base model by up to 53.2% and 51.7% on the V.S.U.N. (valid, stable, unique, novel) metric on the MP-20 and Alex-MP-20 datasets, respectively, with particular gains in novelty.
📝 Abstract
De novo crystal generation, a central task in materials discovery, aims to generate crystals that are simultaneously valid, stable, unique, and novel (V.S.U.N.). Existing methods mainly rely on black-box stochastic sampling, providing limited control over how generated structures move beyond the observed distribution. In this paper, we introduce a concept-based compositional framework for crystal generation. We train a vector-quantized variational autoencoder to automatically discover a shared set of reusable crystal concepts, which serve as building blocks for guided generation. These learned concepts are interpretable in terms of both local atomic environments and global symmetry patterns, and they generalize to crystals from different distributions. By recombining such concepts, our framework enables controllable exploration of novel crystals beyond the training distribution, rather than relying solely on unconstrained random sampling. To further improve composition efficiency, we introduce a composition generator and iteratively refine it using high-quality samples generated by the model itself. The resulting concept compositions are then used to condition downstream crystal generation. Numerical experiments on MP-20 and Alex-MP-20 show that composing concepts improves the base model by up to 53.2% and 51.7%, respectively, on the V.S.U.N. metric, with particular gains in novelty.
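The concept-discovery step relies on vector quantization, which can be illustrated with a minimal sketch. The code below is not the paper's implementation; it only shows the generic nearest-codebook lookup that a vector-quantized variational autoencoder performs, assuming Euclidean distance. The function name `quantize`, the toy codebook, and the latent values are all hypothetical and chosen for illustration.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry.

    Assumption: squared Euclidean distance, as in a generic VQ-VAE.
    Returns the chosen code indices and the quantized vectors.
    """
    # Pairwise squared distances, shape (n_latents, n_codes)
    diffs = latents[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    indices = np.argmin(dists, axis=1)
    return indices, codebook[indices]

# Toy codebook: 4 hypothetical "concepts" in a 4-dimensional latent space.
codebook = np.eye(4)

# Hypothetical encoder outputs: slightly perturbed copies of codes 2, 0, 2.
latents = codebook[[2, 0, 2]] + 0.05

indices, quantized = quantize(latents, codebook)
print(indices.tolist())  # [2, 0, 2]
```

In this toy setting, the index sequence `[2, 0, 2]` plays the role of a concept composition: a discrete, reusable description that could in principle be recombined and passed as a condition to a downstream generator.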