🤖 AI Summary
To address the low efficiency of inverse materials design caused by the high computational cost of density functional theory (DFT), this work introduces CrystalGym, an open-source benchmark environment for crystalline materials discovery that couples online reinforcement learning (RL) with real-time DFT calculations. Methodologically, DFT outputs, namely the band gap, bulk modulus, and density, are used directly as sparse, delayed reward signals; common value-based and policy-based RL algorithms are benchmarked, and a case study examines fine-tuning large language models (LLMs) with RL to improve DFT-based rewards. Contributions include: (1) defining a novel class of RL challenges involving time-consuming reward signals from high-cost physical simulations; (2) empirically benchmarking the sample efficiency and convergence behavior of diverse RL algorithms on DFT-driven tasks; and (3) releasing an open-source platform intended as interdisciplinary infrastructure for AI for Science. Experiments show that no benchmarked algorithm solves all CrystalGym tasks, but reveal marked differences in sample efficiency and ease of convergence across algorithms and environment settings.
📝 Abstract
In silico design and optimization of new materials primarily rely on high-accuracy atomic simulators that perform density functional theory (DFT) calculations. While recent works showcase the strong potential of machine learning to accelerate the material design process, they mostly consist of generative approaches that do not use direct DFT signals as feedback to improve training and generation, mainly because of DFT's high computational cost. To aid the adoption of direct DFT signals in the materials design loop through online reinforcement learning (RL), we propose CrystalGym, an open-source RL environment for crystalline material discovery. Using CrystalGym, we benchmark common value- and policy-based reinforcement learning algorithms for designing various crystals conditioned on target properties. Concretely, we optimize for challenging properties such as the band gap, bulk modulus, and density, which are computed directly with DFT inside the environment. While none of the algorithms we benchmark solve all CrystalGym tasks, our extensive experiments and ablations show different sample efficiencies and ease of convergence to optimality for different algorithms and environment settings. Additionally, we include a case study on the scope of fine-tuning large language models with reinforcement learning for improving DFT-based rewards. Our goal is for CrystalGym to serve as a test bed for reinforcement learning researchers and material scientists to address these real-world design problems with practical applications. We therefore introduce a novel class of challenges for reinforcement learning methods dealing with time-consuming reward signals, paving the way for future interdisciplinary research in machine learning motivated by real-world applications.
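To make the sparse, delayed reward structure described above concrete, the sketch below mocks a gym-style interaction loop in which the expensive property evaluation (a DFT calculation in the real environment) happens only at the terminal step of an episode. This is a minimal illustrative sketch, not CrystalGym's actual API; the class and method names here (`MockCrystalEnv`, the target band gap, the mock property function) are assumptions for illustration.

```python
import random

class MockCrystalEnv:
    """Toy stand-in for a DFT-backed crystal design environment.

    An episode builds up a crystal description step by step. Intermediate
    rewards are zero; only the terminal step yields a reward, mimicking the
    sparse, delayed signal of a DFT-based property calculation.
    """

    def __init__(self, target_band_gap=1.5, max_steps=5):
        self.target = target_band_gap
        self.max_steps = max_steps

    def reset(self):
        self.steps = 0
        self.state = []  # partial crystal description (placeholder)
        return tuple(self.state)

    def step(self, action):
        self.state.append(action)
        self.steps += 1
        done = self.steps >= self.max_steps
        if not done:
            return tuple(self.state), 0.0, False  # no signal mid-episode
        # Terminal step: the real environment would launch a DFT run here;
        # this mock derives a "band gap" cheaply from the action sequence.
        band_gap = sum(self.state) / len(self.state)
        reward = -abs(band_gap - self.target)  # closer to target is better
        return tuple(self.state), reward, True

env = MockCrystalEnv()
obs = env.reset()
done = False
while not done:
    action = random.choice([0.0, 1.0, 2.0])  # random-policy placeholder
    obs, reward, done = env.step(action)
print("terminal reward:", reward)
```

Because the reward arrives only after a full episode (and each real reward costs a DFT run), sample efficiency dominates algorithm choice here, which is exactly the axis along which the benchmarked RL methods differ.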