AI Summary
Existing crystal generation methods face a fundamental trade-off: large language models (LLMs) excel at modeling discrete atomic types but struggle with continuous structural variables (e.g., atomic coordinates, lattice parameters), whereas equivariant diffusion models handle continuous geometry effectively but lack compositional reasoning. To bridge this gap, we propose CrysLLMGen, the first hybrid framework integrating LLMs and equivariant diffusion models. In our approach, the LLM generates atomic types and a coarse initial structure, while the equivariant diffusion model refines atomic positions and lattice parameters in an SE(3)-equivariant manner. This design enables joint modeling of discrete composition and continuous geometry, supporting user-specified conditional generation. On multiple benchmarks, CrysLLMGen achieves significant improvements in structural validity, compositional accuracy, thermodynamic stability, and novelty, while demonstrating strong conditional controllability. Our work establishes a new paradigm for data-driven crystal material discovery.
Abstract
Recent advances in generative modeling have shown significant promise in designing novel periodic crystal structures. Existing approaches typically rely on either large language models (LLMs) or equivariant denoising models, each with complementary strengths: LLMs excel at handling discrete atomic types but often struggle with continuous features such as atomic positions and lattice parameters, while denoising models are effective at modeling continuous variables but encounter difficulties in generating accurate atomic compositions. To bridge this gap, we propose CrysLLMGen, a hybrid framework that integrates an LLM with a diffusion model to leverage their complementary strengths for crystal material generation. During sampling, CrysLLMGen first employs a fine-tuned LLM to produce an intermediate representation of atom types, atomic coordinates, and lattice structure. While retaining the predicted atom types, it passes the atomic coordinates and lattice structure to a pre-trained equivariant diffusion model for refinement. Our framework outperforms state-of-the-art generative models across several benchmark tasks and datasets. Specifically, CrysLLMGen not only achieves balanced performance in terms of structural and compositional validity but also generates more stable and novel materials than LLM-based and denoising-based models. Furthermore, CrysLLMGen exhibits strong conditional generation capabilities, effectively producing materials that satisfy user-defined constraints. Code is available at https://github.com/kdmsit/crysllmgen
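The two-stage sampling procedure described above can be illustrated with a minimal Python sketch. It is not the CrysLLMGen implementation: `toy_llm_propose`, `toy_denoise_step`, `refine`, the `Crystal` container, the prompt string, and the step count are all hypothetical stand-ins chosen for illustration. The toy "denoiser" simply pulls values toward fixed targets and is not equivariant. The sketch only shows the data flow, in which the discrete atom types from the first stage are kept fixed while the continuous coordinates and lattice parameters are iteratively refined.

```python
from dataclasses import dataclass, replace
from typing import Callable, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class Crystal:
    atom_types: Tuple[str, ...]    # discrete: element symbols
    frac_coords: Tuple[Vec3, ...]  # continuous: fractional coordinates in [0, 1)
    lattice: Tuple[float, ...]     # continuous: (a, b, c, alpha, beta, gamma)


def toy_llm_propose(prompt: str) -> Crystal:
    """Hypothetical stand-in for the fine-tuned LLM: returns a coarse structure."""
    return Crystal(
        atom_types=("Na", "Cl"),
        frac_coords=((0.03, 0.98, 0.01), (0.47, 0.52, 0.49)),
        lattice=(5.5, 5.7, 5.6, 89.0, 91.0, 90.5),
    )


def toy_denoise_step(crystal: Crystal, t: int, rate: float = 0.5):
    """Hypothetical stand-in for one step of the pre-trained diffusion model.

    Pulls coordinates toward the nearest multiple of 0.5, cell lengths toward
    their mean, and cell angles toward 90 degrees. The step index t is unused
    in this toy version.
    """
    def pull(x: float, target: float) -> float:
        return x + rate * (target - x)

    coords = tuple(
        tuple(pull(x, round(x * 2) / 2) for x in xyz) for xyz in crystal.frac_coords
    )
    a, b, c, alpha, beta, gamma = crystal.lattice
    mean_len = (a + b + c) / 3
    lattice = (
        pull(a, mean_len), pull(b, mean_len), pull(c, mean_len),
        pull(alpha, 90.0), pull(beta, 90.0), pull(gamma, 90.0),
    )
    return coords, lattice


def refine(crystal: Crystal, denoise_step: Callable, num_steps: int = 10) -> Crystal:
    """Refine only the continuous variables; atom types are never modified."""
    current = crystal
    for t in reversed(range(num_steps)):
        coords, lattice = denoise_step(current, t)
        # Wrap fractional coordinates back into [0, 1) to respect periodicity.
        coords = tuple(tuple(x % 1.0 for x in xyz) for xyz in coords)
        current = replace(current, frac_coords=coords, lattice=lattice)
    return current


coarse = toy_llm_propose("Generate a stable binary halide")  # stage 1: LLM proposal
refined = refine(coarse, toy_denoise_step)                    # stage 2: refinement
print(refined.atom_types, refined.lattice)
```

In this sketch the composition comes entirely from the first stage and the geometry is adjusted only in the second, which mirrors the division of labor stated in the abstract.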