🤖 AI Summary
Experimental databases for data-driven magnetic materials design are small, sparsely labeled, and weak in feature representation. To address these bottlenecks, this work introduces NEMAD, the first experimentally oriented magnetic materials database constructed automatically with large language models (LLMs). It covers 26,706 materials with compositional, crystallographic, Curie/Néel temperature, and magnetic property data. The authors propose an LLM-driven paradigm for automated magnetic data curation that integrates magnetic property-aware feature engineering, graph neural networks, and ensemble regression models, validated via cross-dataset evaluation on Materials Project. The approach achieves 90% classification accuracy and R² scores of 0.86 (MAE = 62 K) and 0.85 (MAE = 32 K) for Curie and Néel temperature prediction, respectively, addressing the quality and scale limitations of conventional databases. It also identifies 62 ferromagnetic candidates with a predicted Curie temperature above 500 K and 19 antiferromagnetic candidates with a predicted Néel temperature above 100 K.
📝 Abstract
The discovery of novel magnetic materials with greater operating temperature ranges and optimized performance is essential for advanced applications. Current data-driven approaches are limited by the lack of accurate, comprehensive, and feature-rich databases. This study addresses that challenge by introducing a new approach that uses Large Language Models (LLMs) to create a comprehensive, experiment-based magnetic materials database named the Northeast Materials Database (NEMAD), which consists of 26,706 magnetic materials (www.nemad.org). The database incorporates chemical composition, magnetic phase transition temperatures, structural details, and magnetic properties. Enabled by NEMAD, machine learning models were developed to classify materials and predict transition temperatures. Our classification model achieved an accuracy of 90% in categorizing materials as ferromagnetic (FM), antiferromagnetic (AFM), or non-magnetic (NM). The regression models predict the Curie (Néel) temperature with a coefficient of determination (R²) of 0.86 (0.85) and a mean absolute error (MAE) of 62 K (32 K). These models identified 62 (19) FM (AFM) candidates from the Materials Project with a predicted Curie (Néel) temperature above 500 K (100 K). This work shows the feasibility of combining LLMs for automated data extraction with machine learning models to accelerate the discovery of magnetic materials.
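For readers unfamiliar with the metrics quoted in the abstract, the following is a minimal sketch of how R² and MAE are conventionally computed for a temperature regressor, and how a threshold screen such as "predicted Curie temperature above 500 K" could be applied. It is not the authors' code. The function names, the material labels, and all temperature values are hypothetical placeholders chosen for illustration, and the strict "greater than" comparison is an assumption about how "above" is interpreted.

```python
from typing import Dict, List, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values (in K)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def screen_candidates(predictions: Dict[str, float], threshold_k: float) -> List[str]:
    """Return the labels whose predicted transition temperature exceeds the threshold."""
    return [name for name, temp in predictions.items() if temp > threshold_k]


# Hypothetical measured and predicted Curie temperatures in kelvin (not from NEMAD).
measured = [300.0, 600.0, 900.0]
predicted = [350.0, 550.0, 900.0]

mae = mean_absolute_error(measured, predicted)
r2 = r2_score(measured, predicted)

# Hypothetical predictions for unlabeled compounds; the labels are placeholders.
unlabeled = {"compound_A": 520.0, "compound_B": 480.0, "compound_C": 500.0}
high_tc = screen_candidates(unlabeled, threshold_k=500.0)
```

An MAE of 62 K therefore means the predicted Curie temperature is off by 62 K on average, while an R² of 0.86 means the model accounts for 86% of the variance in the measured temperatures of the evaluation set.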