AtomBench: A Benchmark for Generative Atomic Structure Models using GPT, Diffusion, and Flow Architectures

📅 2025-10-17
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
A systematic, comparable evaluation of generative models for materials structure generation has been lacking. Method: We introduce a standardized benchmark platform for crystal structure generation and conduct a unified evaluation of three representative architectures (the transformer-based AtomGPT, the diffusion-based CDVAE, and the Riemannian flow matching model FlowMM) on two superconducting material datasets (JARVIS Supercon 3D and Alexandria). We propose a quantitative evaluation framework based on Kullback-Leibler (KL) divergence and mean absolute error (MAE) to ensure reproducibility and fair comparison. Results: CDVAE achieves the highest structural reconstruction accuracy, followed by AtomGPT and then FlowMM. All code, configuration files, and evaluation protocols will be publicly released to advance standardization in generative materials discovery.

📝 Abstract
Generative models have become significant assets in the exploration and identification of new materials, enabling the rapid proposal of candidate crystal structures that satisfy target properties. Despite the increasing adoption of diverse architectures, a rigorous comparative evaluation of their performance on materials datasets is lacking. In this work, we present a systematic benchmark of three representative generative models: AtomGPT (a transformer-based model), Crystal Diffusion Variational Autoencoder (CDVAE), and FlowMM (a Riemannian flow matching model). These models were trained to reconstruct crystal structures from subsets of two publicly available superconductivity datasets: JARVIS Supercon 3D and DS A/B from the Alexandria database. Performance was assessed using the Kullback-Leibler (KL) divergence between predicted and reference distributions of lattice parameters, as well as the mean absolute error (MAE) of individual lattice constants. On both the KLD and MAE scores, CDVAE performs most favorably, followed by AtomGPT and then FlowMM. All benchmarking code and model configurations will be made publicly available at https://github.com/atomgptlab/atombench_inverse.
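The two evaluation metrics described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's released code: it assumes a histogram-based estimate of the KL divergence between reference and generated lattice-parameter distributions (the bin count and smoothing constant are arbitrary choices here), and a simple MAE over matched lattice constants.

```python
import numpy as np

def kl_divergence(reference, predicted, n_bins=50, eps=1e-10):
    """Histogram estimate of D_KL(P_ref || P_pred) between two 1-D samples,
    e.g. distributions of a lattice parameter such as a, b, or c.
    Shared bin edges are built from the pooled range of both samples."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    lo = min(reference.min(), predicted.min())
    hi = max(reference.max(), predicted.max())
    p, edges = np.histogram(reference, bins=n_bins, range=(lo, hi))
    q, _ = np.histogram(predicted, bins=edges)
    # Normalize to probabilities; eps avoids log(0) in empty bins.
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)))

def mae(reference, predicted):
    """Mean absolute error between matched lattice constants."""
    return float(np.mean(np.abs(np.asarray(reference) - np.asarray(predicted))))
```

A lower KL divergence indicates that the generated lattice-parameter distribution matches the reference distribution more closely; identical samples give a divergence of zero under this estimator.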
Problem

Research questions and friction points this paper is trying to address.

Benchmarking generative models for atomic structure prediction
Comparing transformer, diffusion, and flow architectures systematically
Evaluating model performance on superconductivity crystal datasets
Innovation

Methods, ideas, or system contributions that make the work stand out.

Benchmarking three generative atomic structure models
Evaluating performance using KL divergence and MAE
Comparing transformer, diffusion, and flow architectures
Charles Rhys Campbell
Department of Physics and Astronomy, West Virginia University, Morgantown, WV 26506, USA
Aldo H. Romero
Department of Physics and Astronomy, West Virginia University, Morgantown, WV 26506, USA
Kamal Choudhary
Johns Hopkins University
Computational Materials Science · Machine Learning · Quantum Simulations · Materials Design · Materials