PRISMat: Policy-Driven, Permutation-Invariant Autoregressive Material Generation

📅 2026-05-15
📈 Citations: 0
Influential: 0

🤖 AI Summary
This work addresses the high computational cost of large language models (LLMs) in materials generation by proposing PRISMat, a lightweight, permutation-invariant autoregressive generative model. It combines permutation invariance with a policy-driven mechanism to generate crystal slabs conditioned on target surface properties such as cleavage energy and work function, sidestepping the difficulty of framing material generation as a sequence learning problem. The model takes less inference time than LLM baselines while outperforming them, achieving mean absolute errors of 0.188 eV/Ų for cleavage energy and 2.79 eV for work function, a fourfold reduction in error relative to the next-best model.
📝 Abstract
Rapid identification of candidate materials with target properties has become a key task in materials science. Machine learning has emerged as an alternative to physics-based simulation, offering a faster and cheaper way to filter materials based on their stability and other target properties, reducing the number of candidates that reach the costly synthesis stage. Recently, Large Language Models (LLMs) have been applied to this role, but these models are parameter-heavy and computationally expensive both during training and at inference time, making them unsuitable for high-throughput tasks. This inefficiency stems from both the large over-parameterization of language models and the difficulty of framing material generation as a sequence learning problem. In this paper, we present PRISMat, a cost-effective, permutation-invariant model, which addresses these limitations. We show that PRISMat, despite taking less time for inference, is able to outperform LLMs in generating crystal slabs conditioned on critical materials' surface properties. In targeted material discovery, we achieve mean absolute errors of 0.188 eV/Å$^2$ and 2.79 eV for cleavage energy and work function tasks, respectively, reducing the error of the next best model by 4$\times$.
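To make two terms from the abstract concrete, the following is a minimal sketch of what permutation invariance and mean absolute error mean in practice. It is not PRISMat's architecture, which the abstract does not describe. The feature map `embed_atom`, the sum-pooling encoder `encode_structure`, and the toy three-atom slab are all hypothetical and chosen only for illustration.

```python
import math
import random

def embed_atom(atomic_number, position):
    """Hypothetical per-atom feature map (illustrative only, not from the paper)."""
    x, y, z = position
    return (
        float(atomic_number),
        math.sin(x) + math.sin(y) + math.sin(z),
        x * x + y * y + z * z,
    )

def encode_structure(atoms):
    """Sum-pool per-atom features so the result ignores the order of the atoms."""
    feats = [embed_atom(number, pos) for number, pos in atoms]
    # Summing each feature column treats the atoms as a set, not a sequence.
    return tuple(math.fsum(column) for column in zip(*feats))

def mean_absolute_error(predicted, target):
    """Average absolute difference between predictions and targets."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(target)

# Toy slab (assumed data): (atomic number, Cartesian position).
slab = [(22, (0.0, 0.0, 0.0)), (8, (1.9, 0.0, 0.5)), (8, (0.0, 1.9, 0.5))]
shuffled = slab[:]
random.Random(0).shuffle(shuffled)

code_a = encode_structure(slab)
code_b = encode_structure(shuffled)
invariant = all(
    math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-12)
    for a, b in zip(code_a, code_b)
)
print("same encoding after shuffling atoms:", invariant)
print("example MAE:", mean_absolute_error([1.0, 2.0], [1.5, 1.0]))
```

A sequence model, by contrast, generally produces different internal representations for the two orderings of the same slab, which is one way to read the abstract's remark about the difficulty of framing material generation as a sequence learning problem.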
Problem

Research questions and friction points this paper is trying to address.

material generation
high-throughput screening
machine learning
large language models
target properties
Innovation

Methods, ideas, or system contributions that make the work stand out.

permutation-invariant
autoregressive generation
material discovery
efficient modeling
property-conditioned synthesis