When Active Learning Fails, Uncalibrated Out of Distribution Uncertainty Quantification Might Be the Problem

📅 2025-11-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the root causes of active learning failure in materials discovery, focusing on how uncertainty estimation quality affects model generalization and sampling efficiency under out-of-distribution (OOD) conditions. We systematically evaluate ALIGNN deep ensembles, XGBoost, random forests, and neural networks, leveraging loss-landscape-based uncertainty, predictive variance, and multiple calibration techniques. Results reveal that existing uncertainty calibration strategies generalize poorly to OOD data: calibrated uncertainties often underperform random sampling. Our key contribution is identifying that the primary bottleneck in uncertainty modeling stems not from model architecture or ensemble bias alone, but from insufficient characterization of *empirical uncertainty* in the input feature space. Crucially, the intrinsic OOD nature of the data, not merely algorithmic limitations, is the dominant factor constraining active learning efficacy. These findings call for a shift toward feature-space-driven uncertainty quantification in future materials informatics research.

📝 Abstract
Efficiently and meaningfully estimating prediction uncertainty is important for exploration in active learning campaigns in materials discovery, where samples with high uncertainty are interpreted as containing information missing from the model. In this work, the effects of different uncertainty estimation and calibration methods are evaluated for active learning when using ensembles of ALIGNN, eXtreme Gradient Boosting, Random Forest, and Neural Network model architectures. We compare uncertainty estimates from ALIGNN deep ensembles to loss landscape uncertainty estimates obtained for solubility, bandgap, and formation energy prediction tasks. We then evaluate how the quality of the uncertainty estimate impacts an active learning campaign that seeks model generalization to out-of-distribution data. Uncertainty calibration methods were found to variably generalize from in-domain data to out-of-domain data. Furthermore, calibrated uncertainties were generally unsuccessful in reducing the amount of data required by a model to improve during an active learning campaign on out-of-distribution data when compared to random sampling and uncalibrated uncertainties. The impact of poor-quality uncertainty persists for Random Forest and eXtreme Gradient Boosting models trained on the same data for the same tasks, indicating that this is at least partially intrinsic to the data and not due to model capacity alone. Analysis of the target, in-distribution uncertainty, out-of-distribution uncertainty, and training residual distributions suggests that future work focus on understanding empirical uncertainties in the feature input space for cases where ensemble prediction variances do not accurately capture the missing information required for the model to generalize.
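For readers unfamiliar with the setup the abstract describes, here is a minimal sketch of ensemble-variance uncertainty driving an active-learning acquisition step. The ensemble members, slopes, and pool are hypothetical toy stand-ins, not the paper's ALIGNN models or data:

```python
import random
import statistics

random.seed(0)

# Toy stand-in for a deep ensemble: members that fit the same relationship
# slightly differently (hypothetical slopes, for illustration only).
def make_member():
    slope = 2.0 + random.gauss(0.0, 0.1)
    return lambda x: slope * x

ensemble = [make_member() for _ in range(10)]

def predict_with_uncertainty(x):
    """Ensemble mean and spread; the spread is the predictive-variance
    style of uncertainty estimate the abstract evaluates."""
    preds = [m(x) for m in ensemble]
    return statistics.mean(preds), statistics.stdev(preds)

# Uncertainty-driven acquisition: label the pool points where the ensemble
# disagrees most (random sampling is the baseline this is compared against).
pool = [random.uniform(-5.0, 5.0) for _ in range(100)]
ranked = sorted(pool, key=lambda x: predict_with_uncertainty(x)[1], reverse=True)
batch = ranked[:5]  # candidates sent for labeling in the next round
```

The paper's finding is that when the pool is out-of-distribution, this spread may not reflect the information the model is actually missing, so the ranking can do no better than random.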
Problem

Research questions and friction points this paper is trying to address.

Evaluating uncertainty estimation methods for active learning in materials discovery
Assessing how uncalibrated uncertainty affects model generalization to out-of-distribution data
Investigating why poor uncertainty quality persists across different machine learning models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Evaluating uncertainty calibration methods for active learning
Comparing ensemble uncertainty estimates across multiple model architectures
Analyzing feature space uncertainties to improve generalization capability
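The calibration-transfer failure the paper probes can be illustrated with a toy variance-scaling calibrator (synthetic residuals, not the paper's data): a single scale factor fit on in-distribution errors fixes in-distribution calibration but leaves out-of-distribution uncertainties badly miscalibrated.

```python
import math
import random
import statistics

random.seed(1)

def fit_variance_scale(residuals, sigmas):
    """Closed-form scalar s minimizing Gaussian NLL with variances (s*sigma)^2:
    s^2 = mean(r_i^2 / sigma_i^2)."""
    return math.sqrt(statistics.mean((r / sig) ** 2 for r, sig in zip(residuals, sigmas)))

# Synthetic setup: the raw ensemble reports sigma = 1 everywhere, but true
# in-distribution (ID) error spread is 2, and out-of-distribution (OOD) is 6.
id_sigmas  = [1.0] * 500
id_resid   = [random.gauss(0.0, 2.0) for _ in id_sigmas]
ood_sigmas = [1.0] * 500
ood_resid  = [random.gauss(0.0, 6.0) for _ in ood_sigmas]

s = fit_variance_scale(id_resid, id_sigmas)  # ~2: ID calibration now looks fine
ood_z = statistics.stdev(
    r / (s * sig) for r, sig in zip(ood_resid, ood_sigmas)
)  # would be ~1 if the calibration transferred; here it stays well above 1
```

This mirrors the reported result that calibration methods "variably generalize" from in-domain to out-of-domain data: the scale learned in-distribution has no way to anticipate the larger OOD errors.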
Ashley S. Dale
Department of Materials Science and Engineering, University of Toronto, 27 King’s College Cir, Toronto, ON, Canada
Kangming Li
Assistant Professor at King Abdullah University of Science and Technology (KAUST)
Materials informatics, first principles calculations, machine learning
Brian DeCost
Material Measurement Laboratory, National Institute of Standards and Technology, 100 Bureau Dr, Gaithersburg, MD, USA.
Hao Wan
Department of Materials Science and Engineering, University of Toronto, 27 King’s College Cir, Toronto, ON, Canada
Yuchen Han
Department of Materials Science and Engineering, University of Toronto, 27 King’s College Cir, Toronto, ON, Canada
Yao Fehlis
Artificial, Inc.
Computational Chemistry, Plasmonics, Machine Learning
Jason Hattrick-Simpers
Department of Materials Science and Engineering, University of Toronto
artificial intelligence, autonomous science, combinatorial materials science, compositionally complex alloys, metallic glasses