🤖 AI Summary
High-entropy alloys (HEAs) are slow to discover because their compositional space is vast and their phase formation mechanisms are complex. To address this, the authors propose a multi-source uncertainty fusion method that integrates computational materials data with knowledge extracted from the scientific literature. The approach models elemental substitutability and combines the resulting evidence using Dempster-Shafer evidence theory, enabling knowledge transfer and extrapolation when data are scarce. Domain knowledge is distilled from the literature using large language models (LLMs) and used as an additional evidence source, yielding a phase stability prediction framework that is both generalizable and interpretable. Evaluated on quaternary HEA systems in cross-validation experiments, the method outperforms single-source models and baseline machine learning approaches. Notably, it retains robust predictive performance even when key constituent elements are entirely absent from the training data, which supports its practical utility for accelerated HEA discovery.
📝 Abstract
Discovering novel high-entropy alloys (HEAs) with desirable properties is challenging due to the vast compositional space and complex phase formation mechanisms. Efficient exploration of this space requires a strategic approach that integrates heterogeneous knowledge sources. Here, we propose a framework that systematically combines knowledge extracted from computational materials datasets with domain knowledge distilled from scientific literature using large language models (LLMs). A central feature of this approach is the explicit consideration of element substitutability: identifying chemically similar elements that can be interchanged to potentially stabilize desired HEAs. Dempster-Shafer theory, a mathematical framework for reasoning under uncertainty, is employed to model and combine substitutabilities based on aggregated evidence from multiple sources. The framework predicts the phase stability of candidate HEA compositions and is systematically evaluated on quaternary alloy systems, where it outperforms baseline machine learning models and methods reliant on single-source evidence in cross-validation experiments. By leveraging multi-source knowledge, the framework retains robust predictive power even when key elements are absent from the training data, underscoring its potential for knowledge transfer and extrapolation. Furthermore, the enhanced interpretability of the methodology offers insights into the fundamental factors governing HEA formation. Overall, this work provides a promising strategy for accelerating HEA discovery by integrating computational and textual knowledge sources, enabling efficient exploration of vast compositional spaces with improved generalization and interpretability.
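To make the evidence-fusion step concrete, the following is a minimal sketch of Dempster's rule of combination, the standard operation in Dempster-Shafer theory for merging two bodies of evidence. It is an illustration only, not the authors' implementation: the two-hypothesis frame (`substitutable` / `not_substitutable`), the names `m_data` and `m_text`, and every mass value are hypothetical and chosen purely to show the arithmetic.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses (a focal element)
    to its mass; the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence accumulates on the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; combination is undefined.")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical frame of discernment for one element pair.
SUB = frozenset({"substitutable"})
NOT = frozenset({"not_substitutable"})
EITHER = SUB | NOT  # mass placed here expresses ignorance

# Illustrative (made-up) masses from two evidence sources.
m_data = {SUB: 0.6, NOT: 0.1, EITHER: 0.3}  # e.g. from a computational dataset
m_text = {SUB: 0.5, NOT: 0.2, EITHER: 0.3}  # e.g. from LLM-distilled literature

fused = combine(m_data, m_text)
for focal, mass in fused.items():
    print(sorted(focal), round(mass, 3))
```

Mass assigned to the full frame (`EITHER`) represents a source's ignorance rather than a belief in either outcome, which is what lets a sparse source contribute weak evidence without forcing a decision. When both sources lean the same way, the fused belief in that hypothesis is stronger than either source's alone.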