Benchmarking bandgap prediction in semiconductors under experimental and realistic evaluation settings

📅 2026-04-28

🤖 AI Summary
This work addresses two gaps: machine learning models trained on density functional theory (DFT) data generalize poorly to experimentally measured semiconductor bandgaps, and no existing evaluation framework jointly considers data fidelity, domain generalization, and interpretability. The authors introduce RealMat-BaG, a benchmark for experimental bandgap prediction that pairs open-access experimental bandgaps with their corresponding crystal structures. They evaluate graph neural networks and classical machine learning baselines across statistical and domain-based splits, measure how well models transfer from DFT-computed to experimental bandgaps, and analyze interpretability at both the elemental-property and structural levels. The results expose generalization limits of current models in experimental settings and provide a standardized benchmark for developing more reliable, experiment-aligned methods for materials discovery.
📝 Abstract
Accurate bandgap prediction is crucial for semiconductor applications, yet machine learning models trained on computational data often struggle to generalize to experimental bandgap measurements. Challenges related to data fidelity, domain generalization, and model interpretability remain insufficiently addressed in existing evaluation frameworks. To bridge this gap, we introduce RealMat-BaG, a benchmark for assessing model reliability under experimentally relevant conditions. We curate an open-access dataset of experimental bandgaps with aligned crystal structures and compare graph neural networks as well as classical machine learning baselines. Our framework evaluates performance across statistical and domain-based splits, examines transfer from DFT-computed to experimental bandgaps, and analyzes interpretability at both elemental-property and structural levels. Our results reveal the fundamental generalization limitations of current bandgap prediction models and establish a benchmark aligned with experimental measurements for developing more reliable learning strategies for materials discovery.
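The difference between a statistical split and a domain-based split can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the record fields (`id`, `family`, `gap_exp`), the grouping by chemical family, the toy values (placeholders, not measurements), and the mean-value baseline, which stands in for a real model and is not one of the methods evaluated in the paper.

```python
import random


def random_split(records, test_fraction=0.2, seed=0):
    """Statistical split: shuffle all records and hold out a fixed fraction."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def domain_split(records, held_out_domain, key="family"):
    """Domain-based split: hold out every record belonging to one domain."""
    train = [r for r in records if r[key] != held_out_domain]
    test = [r for r in records if r[key] == held_out_domain]
    return train, test


def fit_mean_baseline(train, target="gap_exp"):
    """Trivial stand-in model: always predict the mean training bandgap."""
    mean = sum(r[target] for r in train) / len(train)
    return lambda record: mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions, in eV."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy records; the bandgap values are placeholders, not data.
records = [
    {"id": "oxide-1", "family": "oxide", "gap_exp": 3.0},
    {"id": "oxide-2", "family": "oxide", "gap_exp": 3.4},
    {"id": "sulfide-1", "family": "sulfide", "gap_exp": 2.0},
    {"id": "sulfide-2", "family": "sulfide", "gap_exp": 2.4},
    {"id": "nitride-1", "family": "nitride", "gap_exp": 1.0},
    {"id": "nitride-2", "family": "nitride", "gap_exp": 1.4},
]


def evaluate(train, test):
    """Fit the baseline on train and report its MAE on test."""
    model = fit_mean_baseline(train)
    y_true = [r["gap_exp"] for r in test]
    y_pred = [model(r) for r in test]
    return mean_absolute_error(y_true, y_pred)


if __name__ == "__main__":
    print("random split MAE:", evaluate(*random_split(records)))
    print("held-out sulfide MAE:", evaluate(*domain_split(records, "sulfide")))
```

In a random split, the test set usually shares chemical families with the training set, so the score can overstate how a model behaves on unseen chemistry. Holding out an entire family removes that overlap, which is the kind of setting a domain-based split is meant to probe.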
Problem

Research questions and friction points this paper is trying to address.

bandgap prediction
semiconductors
domain generalization
experimental data
machine learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

bandgap prediction
experimental benchmark
domain generalization
graph neural networks
materials discovery