Predicting band gap from chemical composition: A simple learned model for a material property with atypical statistics

📅 2025-01-06
🤖 AI Summary
Problem: Accurately predicting the electronic band gap of crystalline materials from chemical composition alone remains challenging, given the non-Gaussian, multimodal nature of the band gap distribution and the absence of structural information. Method: This work proposes a lightweight, interpretable model that treats the band gap as a mixed random variable. Each element is assigned a single learned scalar parameter; the prediction for a material is a weighted average of the parameters of its constituent elements, truncated at zero by taking the maximum of that average and zero. The method deliberately omits crystal-structure inputs and relies only on chemical composition. Contribution/Results: On diverse inorganic crystal datasets, the model achieves accuracy competitive with state-of-the-art structure-aware deep learning models while reducing the parameter count by three orders of magnitude. The learned elemental parameters are chemically consistent with established physical quantities, including electronegativity and orbital energy levels, which supports the model's interpretability and physical plausibility.

📝 Abstract
In solid-state materials science, substantial efforts have been devoted to the calculation and modeling of the electronic band gap. While a wide range of ab initio methods and machine learning algorithms have been created that can predict this quantity, the development of new computational approaches for studying the band gap remains an active area of research. Here we introduce a simple machine learning model for predicting the band gap using only the chemical composition of the crystalline material. To motivate the form of the model, we first analyze the empirical distribution of the band gap, which sheds new light on its atypical statistics. Specifically, our analysis enables us to frame band gap prediction as a task of modeling a mixed random variable, and we design our model accordingly. Our model formulation incorporates thematic ideas from chemical heuristic models for other material properties in a manner that is suited towards the band gap modeling task. The model has exactly one parameter corresponding to each element, which is fit using data. To predict the band gap for a given material, the model computes a weighted average of the parameters associated with its constituent elements and then takes the maximum of this quantity and zero. The model provides heuristic chemical interpretability by intuitively capturing the associations between the band gap and individual chemical elements.
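
The prediction rule described in the abstract (a weighted average of per-element parameters, followed by a maximum with zero) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `predict_band_gap`, the use of normalized stoichiometric amounts as the weights, and the parameter values in `element_params` are all assumptions made up for the example. In the paper the parameters are fit to data.

```python
import numpy as np

def predict_band_gap(composition, element_params):
    """Sketch of the prediction rule: max(weighted average of element parameters, 0).

    composition:    dict mapping element symbol -> amount in the formula
    element_params: dict mapping element symbol -> one scalar parameter
    Assumption: the weights are the amounts normalized to sum to one.
    """
    elements = list(composition)
    amounts = np.array([composition[el] for el in elements], dtype=float)
    weights = amounts / amounts.sum()  # assumed weighting: atomic fractions
    params = np.array([element_params[el] for el in elements], dtype=float)
    weighted_average = float(weights @ params)
    # Clip at zero, so any negative average maps to a predicted gap of exactly 0.
    return max(weighted_average, 0.0)

# Hypothetical placeholder parameters (NOT fitted values from the paper).
element_params = {"Ga": -1.0, "As": 3.0, "Cu": -2.0, "Zn": -1.0}

gap_positive = predict_band_gap({"Ga": 1, "As": 1}, element_params)
gap_clipped = predict_band_gap({"Cu": 1, "Zn": 1}, element_params)
print(gap_positive, gap_clipped)
```

With these placeholder values the first call averages -1.0 and 3.0 to give 1.0, while the second averages to -1.5 and is clipped to 0.0. The clipping step is what lets a single formula return exactly zero for some materials and a continuous positive value for others, which matches the mixed-random-variable framing in the abstract.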
Problem

Research questions and friction points this paper is trying to address.

Electronic Band Gap Prediction
Solid State Materials
Chemical Composition
Innovation

Methods, ideas, or system contributions that make the work stand out.

Electronic Band Gap Prediction
Chemical Composition-based Model
Data-driven Elemental Values
Andrew Ma
Massachusetts Institute of Technology
materials science, condensed matter, computational physics, machine learning
Owen Dugan
Stanford CS PhD Candidate
Marin Soljačić
Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA