AtomBench: A Benchmark for Generative Atomic Structure Models using GPT, Diffusion, and Flow Architectures

📅 2025-10-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
A systematic, comparable evaluation of generative models for materials structure generation has been lacking. Method: We introduce a standardized benchmark platform for crystal structure generation and conduct a unified evaluation of three representative architectures (the transformer-based AtomGPT, the diffusion-based CDVAE, and the Riemannian flow matching model FlowMM) on two superconducting material datasets (JARVIS Supercon 3D and Alexandria). We propose a quantitative evaluation framework based on Kullback-Leibler (KL) divergence and mean absolute error (MAE) to ensure reproducibility and fair comparison. Results: CDVAE achieves the highest structural reconstruction accuracy, followed by AtomGPT and then FlowMM. All code, configuration files, and evaluation protocols will be publicly released to advance standardization in generative materials discovery.

📝 Abstract
Generative models have become significant assets in the exploration and identification of new materials, enabling the rapid proposal of candidate crystal structures that satisfy target properties. Despite the increasing adoption of diverse architectures, a rigorous comparative evaluation of their performance on materials datasets is lacking. In this work, we present a systematic benchmark of three representative generative models: AtomGPT (a transformer-based model), Crystal Diffusion Variational Autoencoder (CDVAE), and FlowMM (a Riemannian flow matching model). These models were trained to reconstruct crystal structures from subsets of two publicly available superconductivity datasets: JARVIS Supercon 3D and DS A/B from the Alexandria database. Performance was assessed using the Kullback-Leibler (KL) divergence between predicted and reference distributions of lattice parameters, as well as the mean absolute error (MAE) of individual lattice constants. On both the KLD and MAE scores, CDVAE performs most favorably, followed by AtomGPT and then FlowMM. All benchmarking code and model configurations will be made publicly available at https://github.com/atomgptlab/atombench_inverse.
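The two evaluation metrics described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's released code: it assumes a histogram-based estimate of the KL divergence between reference and generated lattice-parameter distributions (the bin count and smoothing constant are arbitrary choices here), and a simple MAE over matched lattice constants.

```python
import numpy as np

def kl_divergence(reference, predicted, n_bins=50, eps=1e-10):
    """Histogram estimate of D_KL(P_ref || P_pred) between two 1-D samples,
    e.g. distributions of a lattice parameter such as a, b, or c.
    Shared bin edges are built from the pooled range of both samples."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    lo = min(reference.min(), predicted.min())
    hi = max(reference.max(), predicted.max())
    p, edges = np.histogram(reference, bins=n_bins, range=(lo, hi))
    q, _ = np.histogram(predicted, bins=edges)
    # Normalize to probabilities; eps avoids log(0) in empty bins.
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

def mae(reference, predicted):
    """Mean absolute error between matched lattice constants."""
    return float(np.mean(np.abs(np.asarray(reference) - np.asarray(predicted))))
```

A lower KL divergence indicates that the generated lattice-parameter distribution matches the reference distribution more closely; identical samples give a divergence of zero under this estimator.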
Problem

Research questions and friction points this paper is trying to address.

Benchmarking generative models for atomic structure prediction
Comparing transformer, diffusion, and flow architectures systematically
Evaluating model performance on superconductivity crystal datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking three generative atomic structure models
Evaluating performance using KL divergence and MAE
Comparing transformer, diffusion, and flow architectures
Charles Rhys Campbell
Department of Physics and Astronomy, West Virginia University, Morgantown, WV 26506, USA
Aldo H. Romero
Department of Physics and Astronomy, West Virginia University, Morgantown, WV 26506, USA
Kamal Choudhary
Johns Hopkins University
Computational Materials Science · Machine Learning · Quantum Simulations · Materials Design · Materials