Lang2Str: Two-Stage Crystal Structure Generation with LLMs and Continuous Flow Models

📅 2026-03-04
📈 Citations: 0
✨ Influential: 0

🤖 AI Summary
This work addresses the challenge of simultaneously achieving structural plausibility, precision, and flexibility in crystal material design with existing single-stage generative models. To overcome these limitations, we propose Lang2Str, a two-stage generative framework that first leverages a large language model (LLM) to produce structured textual descriptions of crystal geometry and properties, then decodes these descriptions into precise atomic coordinates and lattice parameters using a conditional continuous normalizing flow. By uniquely integrating the LLM's structured reasoning capabilities with the flow model's strength in modeling continuous distributions, Lang2Str enables flexible, controllable, and high-fidelity crystal generation. Experiments demonstrate that Lang2Str achieves state-of-the-art performance in both de novo material generation and crystal structure prediction, yielding structures with superior geometric accuracy and energetic stability compared to existing methods.

📝 Abstract
Generative models hold great promise for accelerating material discovery but are often limited by their inflexible single-stage generative process in designing valid and diverse materials. To address this, we propose a two-stage generative framework, Lang2Str, that combines the strengths of large language models (LLMs) and flow-based models for flexible and precise material generation. Our method frames the generative process as a conditional generative task, where an LLM provides high-level conditions by generating descriptions of material unit cells' geometric layouts and properties. These descriptions, informed by the LLM's extensive background knowledge, ensure reasonable structure designs. A conditioned flow model then decodes these textual conditions into precise continuous coordinates and unit cell parameters. This staged approach combines the structured reasoning of LLMs and the distribution modeling capabilities of flow models. Experimental results show that our method achieves competitive performance on ab initio material generation and crystal structure prediction tasks, with generated structures exhibiting closer alignment to ground truth in both geometry and energy levels, surpassing state-of-the-art models. The flexibility and modularity of our framework further enable fine-grained control over the generation process, potentially leading to more efficient and customizable material design.
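
The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `stage1_describe`, `encode_condition`, `toy_velocity`, and `stage2_decode` are hypothetical names, the LLM is replaced by a fixed string, and the trained flow network is replaced by a hand-written straight-line velocity field. The sketch only shows the control flow: a text description becomes a condition, and an ODE integrated from noise yields continuous coordinates.

```python
import random
from typing import Callable, List

Vector = List[float]


def stage1_describe(formula: str) -> str:
    # Hypothetical stand-in for the LLM call: returns a fixed description.
    return f"{formula}: cubic cell, a=4.0; sites at 0.0 0.0 0.0 and 0.5 0.5 0.5"


def encode_condition(description: str) -> Vector:
    # Toy encoder (assumption): read the numbers after "sites at" as a target.
    tail = description.split("sites at", 1)[1]
    return [float(tok) for tok in tail.replace("and", " ").split()]


def toy_velocity(x: Vector, t: float, cond: Vector) -> Vector:
    # Straight-line flow toward the condition; a trained, text-conditioned
    # network would take the place of this function in a real system.
    return [(c - xi) / (1.0 - t) for xi, c in zip(x, cond)]


def stage2_decode(
    cond: Vector,
    velocity: Callable[[Vector, float, Vector], Vector],
    steps: int = 50,
    seed: int = 0,
) -> Vector:
    # Start from Gaussian noise and integrate dx/dt = v(x, t, cond)
    # from t = 0 toward t = 1 with fixed-step Euler updates.
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in cond]
    dt = 1.0 / steps
    for k in range(steps):
        t = k * dt  # largest t used is 1 - dt, so 1 - t never reaches zero
        v = velocity(x, t, cond)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    description = stage1_describe("NaCl")
    condition = encode_condition(description)
    coords = stage2_decode(condition, toy_velocity)
    print(description)
    print([round(c, 6) for c in coords])
```

With this toy velocity field the integration ends at the encoded target up to floating-point error, which makes the sketch easy to check. Because the two stages only share the condition vector, either one can be swapped out independently, which is the modularity the abstract points to.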
Problem

Research questions and friction points this paper is trying to address.

generative models
material discovery
crystal structure generation
validity and diversity
Innovation

Methods, ideas, or system contributions that make the work stand out.

two-stage generation
large language models
flow-based models
crystal structure generation
conditional generative modeling