aLLoyM: A large language model for alloy phase diagram prediction

📅 2025-07-30
🤖 AI Summary
Problem: Predicting alloy phase diagrams is slow and relies heavily on empirical modeling. Method: This work introduces aLLoyM, an LLM fine-tuned to capture composition-temperature-phase relationships. Built on Mistral, an open-source pre-trained LLM, it is trained on question-and-answer (Q&A) pairs for binary and ternary phase diagrams curated from the Computational Phase Diagram Database (CPDDB) and CALPHAD-based assessments. The model is fine-tuned for two distinct Q&A formats: multiple-choice and short-answer. Contribution/Results: Benchmark evaluations show that fine-tuning substantially improves accuracy on multiple-choice phase diagram questions, and the short-answer model can generate novel phase diagrams from the components alone, which supports exploratory prediction for unexplored alloy systems. The short-answer fine-tuned model and the complete benchmarking Q&A dataset are publicly released on Hugging Face.

📝 Abstract
Large Language Models (LLMs) are general-purpose tools with wide-ranging applications, including in materials science. In this work, we introduce aLLoyM, a fine-tuned LLM specifically trained on alloy compositions, temperatures, and their corresponding phase information. To develop aLLoyM, we curated question-and-answer (Q&A) pairs for binary and ternary phase diagrams using the open-source Computational Phase Diagram Database (CPDDB) and assessments based on CALPHAD (CALculation of PHAse Diagrams). We fine-tuned Mistral, an open-source pre-trained LLM, for two distinct Q&A formats: multiple-choice and short-answer. Benchmark evaluations demonstrate that fine-tuning substantially enhances performance on multiple-choice phase diagram questions. Moreover, the short-answer model of aLLoyM exhibits the ability to generate novel phase diagrams from its components alone, underscoring its potential to accelerate the discovery of previously unexplored materials systems. To promote further research and adoption, we have publicly released the short-answer fine-tuned version of aLLoyM, along with the complete benchmarking Q&A dataset, on Hugging Face.
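The two Q&A formats described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `PhaseRecord` type, the question wording, the option labels, and the example phase names are hypothetical and are not taken from the paper, from CPDDB, or from the released dataset.

```python
import random
import string
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseRecord:
    """Hypothetical record: one point of a phase diagram."""
    composition: dict      # element symbol -> percentage (illustrative)
    temperature_k: float   # temperature in kelvin
    phases: tuple          # names of the phases present at this point


def describe(record):
    """Render composition and temperature as a short text fragment."""
    parts = [f"{el} {pct:g}%" for el, pct in record.composition.items()]
    return " + ".join(parts) + f" at {record.temperature_k:g} K"


def short_answer_pair(record):
    """Build a short-answer Q&A pair: the answer is the phase list itself."""
    question = f"What phases form in {describe(record)}?"
    return {"question": question, "answer": " + ".join(record.phases)}


def multiple_choice_pair(record, distractors, rng):
    """Build a multiple-choice Q&A pair: the answer is an option label."""
    correct = " + ".join(record.phases)
    options = [correct] + list(distractors)
    rng.shuffle(options)  # randomize where the correct option appears
    labels = string.ascii_uppercase
    lines = [f"{labels[i]}. {opt}" for i, opt in enumerate(options)]
    question = f"What phases form in {describe(record)}?\n" + "\n".join(lines)
    return {"question": question, "answer": labels[options.index(correct)]}


# Placeholder values only; not real database entries.
record = PhaseRecord({"Cu": 50, "Ni": 50}, 1000, ("FCC_A1",))
print(short_answer_pair(record))
print(multiple_choice_pair(record, ["LIQUID", "BCC_A2"], random.Random(0)))
```

In this sketch the short-answer format asks the model to produce the phase names directly, while the multiple-choice format only asks it to pick a label, which is why the two formats can be evaluated differently.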
Problem

Research questions and friction points this paper is trying to address.

Predicting alloy phase diagrams using fine-tuned LLMs
Enhancing accuracy in multiple-choice phase diagram questions
Generating novel phase diagrams from alloy components
Innovation

Methods, ideas, or system contributions that make the work stand out.

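Mistral fine-tuned for alloy phase prediction in multiple-choice and short-answer formats
Q&A pairs for binary and ternary phase diagrams curated from CPDDB and CALPHAD-based assessments
Short-answer model and benchmarking Q&A dataset publicly released on Hugging Face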
Yuna Oikawa
Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwa-no-ha, Kashiwa, Chiba, 277-8561, Japan
Guillaume Deffrennes
University Grenoble Alpes, CNRS, Grenoble INP, SIMaP, Grenoble, F-38000, France
Taichi Abe
Research Center for Structural Materials, National Institute for Materials Science, 1-2-1 Sengen, Tsukuba, Ibaraki, 305-0047, Japan
Ryo Tamura
National Institute for Materials Science
Materials informatics, Machine learning, Frustrated spin systems, Magnetocaloric effect
Koji Tsuda
Professor, GSFS, The University of Tokyo
Machine Learning, Computational Biology