How Far Can You Grow? Characterizing the Extrapolation Frontier of Graph Generative Models for Materials Science

📅 2026-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic evaluation of how crystalline generative models extrapolate beyond their training scales. To this end, the authors construct RADII, a benchmark dataset of approximately 75,000 nanoparticle structures with continuously tunable radii, and introduce the concept of an "extrapolation frontier": the structure size beyond which a model's outputs become unreliable. They establish a leakage-free, radius-resolved evaluation framework incorporating surface-interior decomposition and cross-metric failure-sequence analysis. Benchmarking five architectures, they find that all models suffer a roughly 13% increase in global positional error beyond the training radius, while local bonding fidelity varies widely across models, and that no two architectures fail in the same order. For well-behaved models, extrapolation error follows a power law with exponent α ≈ 1/3, so an in-distribution fit can forecast out-of-distribution error. This work establishes output scale as a critical evaluation dimension for geometric generative models and provides a quantitative benchmark for assessing the extrapolation performance of materials generative models.

📝 Abstract
Every generative model for crystalline materials harbors a critical structure size beyond which its outputs quietly become unreliable -- we call this the extrapolation frontier. Despite its direct consequences for nanomaterial design, this frontier has never been systematically measured. We introduce RADII, a radius-resolved benchmark of ~75,000 nanoparticle structures (55-11,298 atoms) that treats radius as a continuous scaling knob to trace generation quality from in-distribution to out-of-distribution regimes under leakage-free splits. RADII provides frontier-specific diagnostics: per-radius error profiles pinpoint each architecture's scaling ceiling, surface-interior decomposition tests whether failures originate at boundaries or in bulk, and cross-metric failure sequencing reveals which aspect of structural fidelity breaks first. Benchmarking five state-of-the-art architectures, we find that: (i) all models degrade by ~13% in global positional error beyond training radii, yet local bond fidelity diverges wildly across architectures -- from near-zero to over 2× collapse; (ii) no two architectures share the same failure sequence, revealing the frontier as a multi-dimensional surface shaped by model family; and (iii) well-behaved models obey a power-law scaling exponent α ≈ 1/3 whose in-distribution fit accurately predicts out-of-distribution error, making their frontiers quantitatively forecastable. These findings establish output scale as a first-class evaluation axis for geometric generative models. The dataset and code are available at https://github.com/KurbanIntelligenceLab/RADII.
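The forecasting idea in point (iii) can be illustrated with a minimal sketch: fit a power law to per-radius error on in-distribution radii only, then extrapolate it to a larger radius. The radii, error values, and constants below are synthetic placeholders, not the paper's data; only the α ≈ 1/3 scaling form comes from the abstract.

```python
import numpy as np

# Synthetic per-radius error profile error(r) = c * r**alpha with alpha = 1/3,
# mimicking the power-law scaling reported for well-behaved models.
rng = np.random.default_rng(0)
radii_id = np.array([5.0, 7.5, 10.0, 12.5, 15.0])  # in-distribution radii (units assumed)
errors_id = 0.2 * radii_id ** (1 / 3) * (1 + 0.01 * rng.standard_normal(radii_id.size))

# Fit log(error) = alpha * log(r) + log(c) on in-distribution data only.
alpha, log_c = np.polyfit(np.log(radii_id), np.log(errors_id), 1)

# Forecast out-of-distribution error at a larger radius from the ID fit alone.
r_ood = 30.0
predicted_ood_error = np.exp(log_c) * r_ood ** alpha

print(f"fitted alpha = {alpha:.3f}")
print(f"predicted error at r = {r_ood}: {predicted_ood_error:.3f}")
```

Because the fit is linear in log-log space, the recovered exponent lands close to 1/3 for this synthetic profile, which is the sense in which the frontier of a well-behaved model is "quantitatively forecastable".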
Problem

Research questions and friction points this paper is trying to address.

extrapolation frontier
graph generative models
materials science
nanoparticle generation
out-of-distribution generalization
Innovation

Methods, ideas, or system contributions that make the work stand out.

extrapolation frontier
graph generative models
RADII benchmark
scale-aware evaluation
power-law scaling