🤖 AI Summary
Problem: Alloy phase diagram prediction is inefficient and relies heavily on empirical modeling. Method: This work introduces aLLoyM, a large language model (LLM) fine-tuned to capture composition–temperature–phase relationships. Built on Mistral, an open-source pre-trained LLM, it is trained on question-and-answer (Q&A) pairs for binary and ternary phase diagrams, curated from the Computational Phase Diagram Database (CPDDB) and CALPHAD-based assessments. The model is fine-tuned for two distinct Q&A formats: multiple-choice and short-answer. Contribution/Results: Benchmark evaluations show that fine-tuning substantially improves accuracy on multiple-choice phase diagram questions; the short-answer model can generate novel phase diagrams from their components alone, which supports exploratory prediction for previously unexplored alloy systems. The short-answer model and the complete benchmarking Q&A dataset are publicly released on Hugging Face.
📝 Abstract
Large Language Models (LLMs) are general-purpose tools with wide-ranging applications, including in materials science. In this work, we introduce aLLoyM, a fine-tuned LLM specifically trained on alloy compositions, temperatures, and their corresponding phase information. To develop aLLoyM, we curated question-and-answer (Q&A) pairs for binary and ternary phase diagrams using the open-source Computational Phase Diagram Database (CPDDB) and assessments based on CALPHAD (CALculation of PHAse Diagrams). We fine-tuned Mistral, an open-source pre-trained LLM, for two distinct Q&A formats: multiple-choice and short-answer. Benchmark evaluations demonstrate that fine-tuning substantially enhances performance on multiple-choice phase diagram questions. Moreover, the short-answer model of aLLoyM can generate novel phase diagrams from their components alone, underscoring its potential to accelerate the discovery of previously unexplored materials systems. To promote further research and adoption, we have publicly released the short-answer fine-tuned version of aLLoyM, along with the complete benchmarking Q&A dataset, on Hugging Face.
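As a rough illustration of the two Q&A formats described above, the sketch below turns one composition–temperature–phase record into a short-answer pair and a multiple-choice pair. Everything in it is an assumption made for illustration: the `PhaseRecord` schema, the function names, the prompt wording, and the sample values are hypothetical and are not taken from the paper, CPDDB, or the released dataset.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseRecord:
    """One phase-diagram point (hypothetical schema, not the paper's)."""
    composition: dict[str, float]  # element -> mole percent
    temperature_k: float           # temperature in kelvin
    phases: tuple[str, ...]        # stable phase names at this point


def describe(record: PhaseRecord) -> str:
    """Render composition and temperature as a short text fragment."""
    parts = [f"{el} {pct:g}%" for el, pct in sorted(record.composition.items())]
    return f"{' + '.join(parts)} at {record.temperature_k:g} K"


def short_answer(record: PhaseRecord) -> dict[str, str]:
    """Build a short-answer pair: the answer is the phase list as free text."""
    return {
        "question": f"What phases form for {describe(record)}?",
        "answer": " + ".join(record.phases),
    }


def multiple_choice(
    record: PhaseRecord, distractors: list[str], rng: random.Random
) -> dict[str, str]:
    """Build a multiple-choice pair: the answer is the label of the correct option."""
    correct = " + ".join(record.phases)
    # Keep at most three distractors that differ from the correct answer.
    options = [correct] + [d for d in distractors if d != correct][:3]
    rng.shuffle(options)
    labels = "ABCD"
    lines = [f"{labels[i]}. {opt}" for i, opt in enumerate(options)]
    return {
        "question": f"What phases form for {describe(record)}?\n" + "\n".join(lines),
        "answer": labels[options.index(correct)],
    }


# Placeholder values for illustration only, not read from any database.
record = PhaseRecord({"Cu": 40.0, "Ni": 60.0}, 1500.0, ("FCC_A1",))
sa_pair = short_answer(record)
mc_pair = multiple_choice(
    record, ["LIQUID", "BCC_A2", "FCC_A1 + LIQUID"], random.Random(0)
)
```

The split mirrors the distinction the abstract draws: a multiple-choice pair only asks the model to pick a label, while a short-answer pair asks it to produce the phase names itself, which is the format that would be needed to describe a system with no candidate answers supplied.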